
Back in 2007, Dimitry worked on a website that received millions of unique visitors.

One day, a partnered ad agency provided a large PHP script that they wanted deployed as part of a new implementation. Given the popularity of the website, it was important that any code associated with it was either cached or light. This beast was neither. But Dimitry wasn't to ask questions, only test and deploy.

Minutes after the code went live in production, Dimitry's manager barreled into his cube. "It's all frozen!"

"Wha?" Dimitry choked out around a celebratory sip of high-caffeine soda, which quickly went rancid in his mouth.

"The site is down! Roll back! Roll back!"

Rolling back was the easy part. Now, Dimitry had to figure out what had gone wrong. That meant taking a closer look at the ad agency's code, and facing the nightmare within.

Here was a representative function. Its purpose was to read server-side cookie information and track which ads a user had seen. No user would be presented the same ad twice in a row unless the config file specified otherwise.


function _getcookie($config) {
         $ip = preg_replace("/\./", "", $_SERVER['REMOTE_ADDR']);
         $agent = preg_replace("/\s/", "", $_SERVER['HTTP_USER_AGENT']);
         $cstring = base64_encode($ip . $agent);

        $handle = opendir($config["cookies"]);
        $filelist = array();
        $i = 0;
        while ($file = readdir($handle)) {
                if ($file!="." && $file!=".." && $file!="Thumbs.db") {
                        $filelist[$i] = $file;
                        $i++;
                }
        }

        $cookielist = array();
        $i = 0;
        foreach ($filelist as $key => $value) {
                 if (substr($value, 0, strlen($cstring))==$cstring) {
                          $cookielist[$i] = $value;
                          $i++;
                 }
        }

        $cookies = array();
        foreach ($cookielist as $key => $value) {
                 $cookietemp = explode("%", $value);
                 $cookieid = preg_replace("/\.dat/", "", $cookietemp[1]);
                 unset($cookietemp);
                $cookies[$cookieid] = file_get_contents($config["cookies"] . $value);
        }

        return $cookies;
}

Dimitry tried firing up an FTP client and checking the file system where the cookie data was stored—only to have his client crash and die on him. The problem? The sheer number of files contained within that directory.

At that point, Dimitry realized that each cookie value was stored as its own file, one file per visitor per key. Worse, every call to the function above listed the entire directory into memory. Over 100,000 file names were being scanned, copied into an array, looped over a second time to find the handful belonging to the current visitor, then discarded.
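For contrast, here is a minimal sketch of how the same lookup could avoid the directory scan. It is not what Dimitry's team actually wrote; the story doesn't say. It assumes modern PHP (7.1 or later) and a layout of one JSON file per visitor holding all of that visitor's keys. The names visitor_id, get_cookies, and set_cookie_value are hypothetical.

```php
<?php
// Hypothetical helper: build a visitor identifier from the same inputs the
// original function uses (IP address and user agent).
function visitor_id(array $server): string {
    $ip = str_replace('.', '', $server['REMOTE_ADDR'] ?? '');
    $agent = preg_replace('/\s/', '', $server['HTTP_USER_AGENT'] ?? '');
    // Assumption: hash instead of base64, since base64 output can contain "/"
    // and grows with the length of the user agent string.
    return sha1($ip . $agent);
}

// Hypothetical layout: one JSON file per visitor, so a lookup is a single
// path check instead of a scan over every file in the directory.
function cookie_path(string $dir, string $id): string {
    return rtrim($dir, '/') . '/' . $id . '.json';
}

function get_cookies(string $dir, string $id): array {
    $path = cookie_path($dir, $id);
    if (!is_file($path)) {
        return [];
    }
    $data = json_decode(file_get_contents($path), true);
    return is_array($data) ? $data : [];
}

function set_cookie_value(string $dir, string $id, string $key, string $value): void {
    $cookies = get_cookies($dir, $id);
    $cookies[$key] = $value;
    // LOCK_EX takes an exclusive lock while the file is being written.
    file_put_contents(cookie_path($dir, $id), json_encode($cookies), LOCK_EX);
}
```

With this layout, the cost of a request no longer depends on how many visitors have files in the directory. A database or an in-memory cache would be the more usual choice for a site of that size; the flat-file version is shown only because it stays closest to the original.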

And this was just one function out of many that did similar things. Dimitry and his colleagues had some recoding to do.
