DEV Community

Cover image for Secret Message

Secret Message

Ryan Palo on October 17, 2017

Today I was reading through an article on Dev.to by Ben Greenberg about an interview coding challenge, and I got hooked and had to try it for mysel...
Collapse
 
bengreenberg profile image
Ben Greenberg

This is great! I can't even begin to describe my love for Ruby. It is such a clean language.

Collapse
 
imox2 profile image
Talha • Edited
    def give_me_word(text):
        ans=dict()
        the_text=""
        for i in text:
            if i in ans:
                ans[i]+=1
            else:
                ans[i]=1

        comparison=ans['_']
        sorted_list=sorted(ans.values(),reverse=True)
        rev_dict = dict((v,k) for k,v in ans.iteritems())

        for i in sorted_list:
            if(i>=comparison and rev_dict[i]!='_'):
                the_text=the_text+rev_dict[i]
            elif rev_dict[i]=='_':
                break

        return the_text

    text="the_text"

    print give_me_word(text)

Not fully optimized script to do the same in python

Collapse
 
rpalo profile image
Ryan Palo • Edited

Nice! Thanks for sharing. :) Have you seen the built-in Counter class? This seems like a perfect use-case for it.

from collections import Counter

blob = "..."
letters = "abcdefghijklmnopqrstuvwxyz_"

def message(letters, blob):
    hist = Counter(blob)    # Counter takes any iterable and makes
                            # a histogram of its items!

    sorted_letters = ''.join(x[0] for x in hist.most_common())
    return sorted_letters.split('_')[0]

print(message(letters, blob))
Collapse
 
rpalo profile image
Ryan Palo

Bonus! On my machine, this version is even 5 times faster than the best Ruby version!

cat test.py | python -m timeit
100000000 loops, best of 3: 0.00844 usec per loop
Thread Thread
 
imox2 profile image
Talha

This is great. Will use counter effectively. Thanks for showing me counter(). :-)

Collapse
 
edh_developer profile image
edh_developer

I wondered if I could do better performance-wise, based on knowledge of the problem that the more general built-in classes couldn't take advantage of. Ended up with this:

import string

longString="..."

def decode(msg):
    # Testing suggests this is a little faster than using defaultdict
    charCountHash = {}.fromkeys(string.ascii_lowercase,0)
    charCountHash['_'] = 0

    for ch in msg:
        charCountHash[ch] += 1

    c = charCountHash['_']
    resultHash = {charCountHash[k]: k for k in charCountHash.keys() \
        if charCountHash[k] > c }

    return ''.join( [ resultHash[index] for index in \
        sorted(resultHash.keys(), reverse=True) ] )

print(decode(longString))

Testing both implementations a few times suggests this way is faster, but only 1% or so.

Might try this in go, just for the practice...

Thread Thread
 
rpalo profile image
Ryan Palo

Nice! I haven't used fromkeys before. That's a nice alternative to defaultdict (and you don't have to import anything!) if you already know the keys you need.

Collapse
 
bengreenberg profile image
Ben Greenberg

Python is on my list of things to learn soon and can't help but remark how much it "looks" like Ruby without knowing much about it yet.

Collapse
 
imox2 profile image
Talha

"...without knowing much about it yet." This is great about python. Nice challenge this one. :-)

Collapse
 
buntine profile image
Andrew Buntine

Nice post. It's an interesting look at how algorithmic complexity can be trumped by real-world details.

But I think you'll find that the performance is not exactly in the each_char (which is probably lazily evalutated) but is caused by:

  • The setup/teardown (stack frames, etc) for the block that is passed to each_char
  • The repeated hash lookups (although they are supposed to be O(1))
Collapse
 
rpalo profile image
Ryan Palo

Thanks, that’s super interesting! Good to consider for future block usage. I can’t think of a way to speed up the block setup and tear down, but I wonder if I’d pick any performance up by switching the hash keys to symbols. I’ll have to do some ‘sperimenting 👍🏻

Collapse
 
tobias_salzmann profile image
Tobias Salzmann • Edited

Obvious (!!!) question to ask: What if the text does not fit into memory?

Distributed version in Scala with Spark:

def message(
  distributedBlob: RDD[Char], 
  chars: Set[Char] =  ('a' to 'z').toSet + '_'
)(implicit sc: SparkContext): String = {
  val frequencies = distributedBlob
    .filter(chars.contains)
    .countByValue()
  frequencies
    .toSeq
    .sortBy(- _._2)
    .map(_._1)
    .mkString
    .takeWhile('_' !=)
}
Collapse
 
mukeshbhardwaaj profile image
Mukesh Bhardwaj

Thanks for the amazing community. The way you welcome your new users are amazing. Hey Wait did you check this Solarmovies alternatives Thanks

Collapse
 
der_gopher profile image
Alex Pliutau

Interesting challenge. I want to see how it can be gracefully implemented in Go, even it will be more code than Ruby, but it can be elegant. github.com/plutov/practice-go/tree...

Collapse
 
techpout profile image
TechPout

I am deeply attracted by your post. It is really a nice and informative one. I will recommend it to my friends. Also, you can read this: Putlocker

Collapse
 
der_gopher profile image
Alex Pliutau

Here is solution for same problem in Go - github.com/plutov/practice-go/blob...

Collapse
 
nimuvea1 profile image
Nimuvea

Thanks for sharing this article. Also see the thetechbytes.net.

Collapse
 
jhoncarter profile image
Jhon Carter

Thank you for telling us but can you tell me how to add images in footer of this site: youtech.ooo/

Collapse
 
harshakon profile image
harshakon

want to Improve your grammer and fix other mistakes while writing here we have grammarly free trial to get premium features for free.