Python — List Sorting, Keys & Lambdas

I had a CSV file containing timestamped data that I wanted to plot from oldest to most recent, but unfortunately it was unsorted. I could easily read the data from the file and I knew how to use matplotlib to plot, but I wasn't sure how to sort the data.

I was reading the data from the file, converting the timestamp to a datetime object and then storing it with the other data in a list similar to this:

list = [[datetime.datetime(2016, 7, 10, 0, 57, 54), '2.61'],
        [datetime.datetime(2016, 7, 10, 0, 58, 14), '2.68'],
        …
        [datetime.datetime(2016, 7, 10, 0, 58, 34), '2.61'],
        [datetime.datetime(2016, 7, 10, 0, 58, 54), '2.59']]

Built-in List Sorting

After a quick search I came across the Sorting Mini-How To guide, which has some good info. Turns out Python has two built-in ways to sort a list:

  1. sort(): A list method that modifies the list in-place
  2. sorted(): A built-in function that builds a new sorted list from an iterable
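
To make the difference between the two concrete, here is a minimal sketch. The list of numbers and the variable names are made up for illustration and are not from my actual data:

```python
# Hypothetical sample data, only for illustration.
readings = [3, 1, 2]

# sorted() builds a new list and leaves the original untouched.
ordered_copy = sorted(readings)
snapshot = list(readings)  # copy taken now: still in the original order

# list.sort() reorders the list in place and returns None.
result = readings.sort()
```

After this runs, ordered_copy and readings are both in ascending order, snapshot still holds the original order, and result is None, which is why writing something like x = x.sort() loses the data.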

The guide has lots of info on both options. I went with sort() as I didn't need to keep the original data. To sort, all I needed to do was:

list.sort(key=lambda r:r[0])

This easily sorts the list by datetime, which was cool, but I wasn't sure what the whole key and lambda thing was all about.

Key Parameter

Referencing the How To guide again it states:

"both list.sort() and sorted() added a key parameter to specify a function to be called on each list element prior to making comparisons. The value of the key parameter should be a function that takes a single argument and returns a key to use for sorting purposes. This technique is fast because the key function is called exactly once for each input record."

It gives an example of using the str.lower function to make a case-insensitive string comparison. That all makes sense but what is lambda?!
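
Here is a rough sketch of that idea. The word list and the recording_lower helper are my own hypothetical examples, not taken from the guide; the helper only exists to count how often the key function is called:

```python
# Hypothetical sample data, only for illustration.
words = ["banana", "Cherry", "apple"]

# Without a key, uppercase letters sort before lowercase ones.
default_order = sorted(words)

# With key=str.lower, each word is compared by its lowercase form.
case_insensitive = sorted(words, key=str.lower)

# A key function that records every call it receives.
calls = []

def recording_lower(word):
    calls.append(word)
    return word.lower()

also_case_insensitive = sorted(words, key=recording_lower)
```

Here default_order puts "Cherry" first, while both keyed sorts give apple, banana, Cherry. The calls list ends up with one entry per word, which matches the "called exactly once for each input record" claim in the quote.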

Lambda Functions

A lambda is an anonymous function, that is, a function defined without a name. This post seems to explain it pretty nicely.

Lambda functions are nice for calling in-line because they only have one expression, which is evaluated and returned. The syntax for a lambda is:

lambda arguments: expression
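
As a small sketch of how that syntax lines up with a regular function, the two definitions below should behave the same. The names first_item and first_item_lambda are hypothetical, and giving a lambda a name is only done here so the two can be compared side by side:

```python
# A regular, named function.
def first_item(row):
    return row[0]

# The same behaviour written as a lambda: one argument, one expression,
# and the value of that expression is returned automatically.
first_item_lambda = lambda row: row[0]

# Hypothetical sample row, only for illustration.
sample = ["a", "b"]
same_result = first_item(sample) == first_item_lambda(sample)
```

In practice the lambda is usually written directly where it is needed, for example as the key argument of a sort, rather than being assigned to a variable.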

Putting it all together

So when I used:

list.sort(key=lambda r:r[0])

lambda r:r[0] is an anonymous function with a single argument, r, which in this case was a list, e.g.:

[datetime.datetime(2016, 7, 10, 0, 57, 54), '2.61']

The lambda then returns the first element of the list, in this case the datetime object. This is then used as the key for the sort.
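
A self-contained sketch of the whole idea, using a few rows shaped like my data but deliberately shuffled. The variable names are made up for this example, and operator.itemgetter is a standard-library alternative to the lambda that I did not use in my own script:

```python
import datetime
from operator import itemgetter

# Sample rows in the same [datetime, value] shape, deliberately out of order.
rows = [
    [datetime.datetime(2016, 7, 10, 0, 58, 34), "2.61"],
    [datetime.datetime(2016, 7, 10, 0, 57, 54), "2.61"],
    [datetime.datetime(2016, 7, 10, 0, 58, 54), "2.59"],
    [datetime.datetime(2016, 7, 10, 0, 58, 14), "2.68"],
]

# itemgetter(0) builds a function that returns element 0 of its argument,
# so it can stand in for lambda r: r[0]. sorted() leaves rows untouched.
by_itemgetter = sorted(rows, key=itemgetter(0))

# Sort in place, using the first element of each row (the datetime) as the key.
rows.sort(key=lambda r: r[0])

timestamps = [r[0] for r in rows]
values = [r[1] for r in rows]
```

After the sort the timestamps run from 00:57:54 to 00:58:54 and each value stays attached to its own timestamp, because whole rows are moved and not just the keys.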

The final code looked like this:

import traceback
from datetime import datetime

import matplotlib.pyplot as plt

results = []

# Read every line of the CSV file; the file is closed automatically.
with open('Data.csv', 'r') as f:
    downLines = f.readlines()

for line in downLines:
    try:
        splitLine = line.split(',')
        # The format string must match the timestamps in the file.
        dateTime = datetime.strptime(splitLine[0], ' %m/ %d/%Y %I:%M:%S %p')
        results.append([dateTime, splitLine[1], splitLine[2], splitLine[3]])
    except Exception:
        # Print the traceback and carry on with the next line.
        print('Error:')
        print(traceback.format_exc())
        continue

# Sort the rows by timestamp, oldest first.
results.sort(key=lambda r: r[0])

downDate = [x[0] for x in results]
downData = [y[1] for y in results]

plt.plot(downDate, downData, label='Data')
plt.legend()
plt.show()
