After his boss left the company, Joel C was promoted to team lead. This meant that Joel was not only responsible for their rather large production codebase, but also for interviewing potential new team members. There are a ton of coding questions that one can ask in a technical interview, and Joel figured he should ask one that they actually solve in their application: given two unordered sets of timestamps, calculate how much overlap (if any) there is between the two series.

If you think about it for a minute, it's really quite simple: first, find the minimum and maximum values for each set to get the start and end times (e.g. [01:08:01,01:09:55] and [01:04:11,01:09:42]). Then, subtract the later start time (01:08:01) from the earlier end time (01:09:42) to get the overlap (01:09:42 - 01:08:01 = 00:01:41). A non-positive result would indicate there's no overlap (such as 12:00:04 - 13:11:43), and in that case, it should probably just be zero. Or, in a single line of code:

return max(min(max(a), max(b)) - max(min(a), min(b)), 0)
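To check the arithmetic in the example above, here is a minimal sketch of the same calculation, assuming the timestamps are Python `datetime` objects. The names `overlap` and `ts` and the extra sample values are made up for illustration and are not from Joel's codebase. With `datetime` values the subtraction yields a `timedelta`, so the zero it is clamped against has to be `timedelta(0)` rather than a plain `0`.

```python
from datetime import datetime, timedelta


def overlap(a, b):
    """Return how long the time ranges spanned by a and b overlap."""
    latest_start = max(min(a), min(b))
    earliest_end = min(max(a), max(b))
    # A negative difference means the ranges are disjoint; clamp it to zero.
    return max(earliest_end - latest_start, timedelta(0))


def ts(text):
    # Hypothetical helper: parse "HH:MM:SS" onto an arbitrary fixed date.
    return datetime.strptime(text, "%H:%M:%S")


# Unordered on purpose: only the smallest and largest values matter.
a = [ts("01:09:55"), ts("01:08:01"), ts("01:08:30")]
b = [ts("01:09:42"), ts("01:04:11"), ts("01:06:00")]
print(overlap(a, b))  # 0:01:41
```

Nothing is sorted here; `min` and `max` each make one pass over a list, which is all the problem requires.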

Of course, something more spaced out might help with readability, but Joel saw a lot of candidates overthink the problem. They would sort the lists, create unneeded temporary variables, fail to realize that they only need the smallest and largest value from each list, and so on. In many of those cases, Joel judged candidates quite harshly; it's a simple problem, and if this confuses them, how could they handle more complex problems?

A handful of candidates recognized the problem for how simple it was, but one went so far as to ask, "there are a ton of ways to solve this in code; here's my solution, but I'm really curious how you solved it?"

Joel wasn't really sure how his former boss solved the problem. After spelunking through the codebase, he found out:

def compute(dev_a, dev_b):
    labeled_timestamps = []
    for label, dev in ('a', dev_a), ('b', dev_b):
        for t in dev.timestamps:
            labeled_timestamps.append([t, label])
    labeled_timestamps.sort()

    last_label = None
    start_overlap = None
    end_overlap = None
    last_t = None
    for t, label in labeled_timestamps:
        if last_label is not None and last_label != label:
            if start_overlap is None:
                start_overlap = t
            else:
                end_overlap = last_t
        last_label = label
        last_t = t

    if end_overlap is None:
        end_overlap = start_overlap
    overlap_time = pd.Timedelta(end_overlap - start_overlap, 's')

"I have seen some overly complex answers," Joel wrote, "but this is a level of overthinking that is beyond impressive. If someone had given me this in an interview, I would probably have been left completely dumbfounded, and submitted this as a Tales from the Interview. Instead... I just shut down my computer and started my weekend drinking a little early."
