Posts by: Israel


Exploring ACM

Almost a year ago, a friend of mine and I ran an experiment: a visualization of the references between papers published in the ACM (Association for Computing Machinery) article library. Each paper is cited by other papers, and so on, so we represented those relationships as a directed graph: each node is a document and each edge is a reference from one document to another.

To make it happen, my friend Javier crawled the ACM library and got banned lots of times. He had to restart his modem so often that we had to “borrow” some computers at our university to run the crawler and get the info. After two days we had the disappointing amount of 10,000 articles. We expected many more, but the ACM anti-crawling rules limited us to fetching only 2 articles per minute. Anyway, they were enough to play with, so I made a tiny REST server to store the information and serve it to a web interface… the result was pretty cool! See it in action here. You can type and search for an article or an author. Try “Bayesian” or “Policarpo”.

Then the thing got better: we discovered a couple of “inconsistencies” within the library. At first we thought it was our fault, some errors in our DB maybe. But it was not. The DB was good and the crawler was good. Here is what we found:

Articles that refer to themselves

Because f**k you that’s why. As you can see at the bottom of this post, ACM’s explanation involves errors in the Optical Character Recognition program they use to obtain the references from the article. However, I wonder how this could happen through a mere OCR mistake. A few examples:

Articles that quote each other

How about that! Such a thing should not be possible: there cannot be two papers based on each other, so loops are not legal in this graph. However I was quite surprised about...
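Both kinds of inconsistency can be looked for mechanically once the references are stored as edges of the directed graph. Here is a minimal sketch, assuming each reference is a hypothetical [fromId, toId] pair with simple string or numeric IDs. The findLoops name and the input shape are made up for illustration; this is not the actual crawler or DB code.

```javascript
// Hypothetical input: one [fromId, toId] pair per reference (edge of the graph).
function findLoops(citations) {
  // Index every edge as "from->to" so the reverse edge can be looked up quickly.
  const seen = new Set(citations.map(([from, to]) => `${from}->${to}`));
  const selfCitations = [];
  const mutualCitations = [];
  for (const [from, to] of citations) {
    if (from === to) {
      // An article that refers to itself.
      selfCitations.push(from);
    } else if (from < to && seen.has(`${to}->${from}`)) {
      // Two articles that cite each other; from < to reports each pair once.
      mutualCitations.push([from, to]);
    }
  }
  return { selfCitations, mutualCitations };
}

const sample = [["a", "b"], ["b", "a"], ["c", "c"], ["a", "c"]];
console.log(findLoops(sample));
```

This only catches loops of length one and two, which are the two cases mentioned here. Longer cycles would need a proper graph traversal.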



Uploading WordPress

This is a very specific yet common case: for some reason, you have to use WordPress for something that is not a blog (of course it was not up to you), so you start making custom plugins, shaping up a theme with custom functions and filters, and after a few tests and proper changes to wp-config.php you proceed to upload everything to your server… aaand nothing works. A long time ago I had to figure this out myself and it took me more than an hour, so I’m posting this step-by-step guide hoping to save you some time.

Assuming you are working on your local machine, in a subfolder like israelcruz.com/mylocalwp, and want to upload your files to a production server like myproductiondomain.com, right this way:

Dump the database:

    $ mysqldump -u root -p my_local_database -r db.dump

Replace each occurrence of ‘israelcruz.com/mylocalwp’, or whatever your local installation folder is, with the production domain name in the dump file you just created:

    $ mkdir export
    $ sed 's/israelcruz\.com\/mylocalwp/myproductiondomain.com/g' db.dump > remote_db.dump

Import the new database file remote_db.dump into your production database server using phpMyAdmin, or over ssh:

    $ mysql -uroot -p --default-character-set=utf8 my_remote_database
    mysql> SOURCE remote_db.dump

Modify wp-config.php to match your remote MySQL settings:

    /** The name of the database for WordPress */
    define('DB_NAME', 'my_remote_database');
    /** MySQL database username */
    define('DB_USER', 'my_remote_user');
    /** MySQL database password */
    define('DB_PASSWORD', 'my_remotepa55');
    /** MySQL hostname */
    define('DB_HOST', 'my_remote_host');

If you upload all your files now, everything should work fine on your remote server, except for one thing: if you are using pretty URLs, as in myproductiondomain.com/my-post-name, you’ll get a 404 error when attempting to visit any page other than the homepage. To fix it, follow this final step:

Modify .htaccess from this:

    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /mylocalwp/
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /mylocalwp/index.php [L]
    </IfModule>

to this:

    <IfModule mod_rewrite.c>
    RewriteEngine On...



Randomness

Take a look at the following puzzle:

“Given a function which produces a random integer from 1 to 5, write one that produces a random integer from 1 to 7”

I immediately thought of this:

    ...
    function naiveRand(){
      return (given() + given()) % 7 + 1;
    }
    ...

It is obviously not uniform: there are 25 combinations, each one resulting in an integer between 1 and 7, but there are more ways of getting 7 than any other number. See:

    1 : 4 chances: 5+2, 2+5, 3+4, 4+3
    2 : 3 chances: 3+5, 5+3, 4+4
    3 : 3 chances: 4+5, 5+4, 1+1
    4 : 3 chances: 1+2, 2+1, 5+5
    5 : 3 chances: 3+1, 1+3, 2+2
    6 : 4 chances: 1+4, 4+1, 2+3, 3+2
    7 : 5 chances: 5+1, 1+5, 4+2, 2+4, 3+3

More graphically:

[Chart: distribution of naiveRand() results]

That is clearly not the picture of a uniform distribution, so what if we add a few more invocations of given()? Something like this might do the trick:

    function newRand(){
      return (
        given() + given() + given() +
        given() + given() + given()
      ) % 7 + 1;
    }

[Chart: distribution of newRand() results]

The number of combinations is 5^6 (15625) and, as the chart above suggests, it seems pretty uniform. But if you count the number of ways to get each integer from 1 to 7, with given() returning 1 to 5 as stated, you’ll find this:

    1 : 2224
    2 : 2224
    3 : 2229
    4 : 2238
    5 : 2243
    6 : 2238
    7 : 2229

Number 5 is slightly more likely to come out. So how can this task be achieved? I spent around an hour trying to think of a better solution, but I couldn’t. So I googled the question and I...
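The counting argument above can be checked by brute force instead of by hand. The sketch below assumes given() returns each integer from 1 to 5 with equal probability, walks through every possible combination of results, and tallies where (sum % 7) + 1 lands. The countOutcomes name is made up for this illustration.

```javascript
// Tally the results of (sum % 7) + 1 over all 5^calls combinations,
// assuming each call to given() yields 1..5 with equal probability.
function countOutcomes(calls) {
  const counts = new Array(8).fill(0); // index 0 is unused
  function walk(remaining, sum) {
    if (remaining === 0) {
      counts[(sum % 7) + 1] += 1;
      return;
    }
    for (let value = 1; value <= 5; value++) {
      walk(remaining - 1, sum + value);
    }
  }
  walk(calls, 0);
  return counts.slice(1); // position k-1 holds the number of ways to get k
}

console.log(countOutcomes(2)); // [ 4, 3, 3, 3, 3, 4, 5 ]
console.log(countOutcomes(6)); // the 15625 combinations of newRand()
```

A perfectly uniform result would need the number of combinations to be divisible by 7, and no power of 5 is, so adding more calls to given() can only shrink the bias, never remove it.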



The Zoo problem

I just discovered TopCoder, and I got hooked. I’m going to post here some of the easiest archived problems and their solutions (I’ll start with the hard ones later).

This one is called simply “Zoo”, and it is about a fox that wants to know if the zoo’s rabbits are taller than its cats. This guy is kind of blind, so he can’t distinguish between rabbits and cats, or between a shorter and a taller animal either. But he can ask each animal the following question: “How many animals of the same kind as you are taller than you?” There is another important thing: each animal knows whether another animal of the same kind is taller or not. In other words, there aren’t two cats of the same height and there aren’t two rabbits of the same height.

The problem also states the following:

“There are N animals numbered 0 to N-1”
“Each animal is a rabbit or a cat”
“The answer given by the i-th animal is answers[i]”

You may think that the little fox won’t be able to determine whether cats are taller than rabbits just by knowing the answers from each animal, and that’s true: he will only know how many possible “configurations” there are. So that is what we should provide him: the number of possible configurations. The full description of the problem is here.

So we are given an array of integers such that each index i of the array is the name of an animal and the value in that position is i’s answer. That said, if we have an input like [0,1] we can say that animals 0 and 1 are either both rabbits or both cats, because if animal 0 were a cat and animal 1 a rabbit, animal 1 would be lying (saying that there is another rabbit taller than her). So we...
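The rules above are enough to count configurations by brute force for small inputs. This is only a sketch, not necessarily the solution the full post arrives at, and it assumes a “configuration” means a choice of rabbit or cat for every animal. Since heights within a kind are all different, k animals of the same kind must give the answers 0, 1, ..., k-1 in some order. The function names are made up for this illustration.

```javascript
// With distinct heights, k animals of one kind must answer 0..k-1 in some order.
function isConsistent(group) {
  const sorted = [...group].sort((a, b) => a - b);
  return sorted.every((answer, index) => answer === index);
}

// Try every rabbit/cat labelling (2^N of them) and count the consistent ones.
function countConfigurations(answers) {
  const n = answers.length;
  let valid = 0;
  for (let mask = 0; mask < (1 << n); mask++) {
    const groups = [[], []]; // answers given by rabbits, answers given by cats
    for (let i = 0; i < n; i++) {
      groups[(mask >> i) & 1].push(answers[i]);
    }
    if (groups.every(isConsistent)) {
      valid += 1;
    }
  }
  return valid;
}

console.log(countConfigurations([0, 1])); // 2: both rabbits, or both cats
```

Trying all 2^N labellings is only practical for a handful of animals, but it is a handy reference for checking a faster solution against.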



Hello Internet!

I finally got convinced to write a blog. I’ve been solving some puzzles and small programming challenges, so I thought it would be better to write the answers and notes in a blog instead of keeping them in a dusty folder somewhere on my hard drive. Please feel free to comment; any corrections will be highly appreciated.
