The Problem
Most of us know how to parse a CSV file, but what about other sources, such as a block of text? For example, given the following block of text:
text = """
uid,alias,shell
501,karen,bash
502,john,tcsh
"""
How do we parse this text? Should we write it to a file first and then read it back?
The Solution
According to the documentation for the csv library, a CSV reader accepts a file or any other object that supports the iterator protocol and yields a string on each iteration, which includes lists and in-memory file objects. Let us explore the first solution, which splits the text into lines.
import csv

text = """
uid,alias,shell
501,karen,bash
502,john,tcsh
"""
reader = csv.reader(text.strip().splitlines())
for row in reader:
    print(row)
['uid', 'alias', 'shell']
['501', 'karen', 'bash']
['502', 'john', 'tcsh']
That was easy. Note that we called .strip() to remove the whitespace surrounding the text block before splitting it into individual lines. Next, we feed the lines to csv.reader and let it do its work.
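Because csv.reader only needs an iterable that yields one string per row, a generator should work just as well as a list. The sketch below assumes a hypothetical helper named non_blank_lines (not part of the csv library) that lazily skips blank lines, so the input does not need to be stripped first:

```python
import csv


def non_blank_lines(block):
    """Yield each non-blank line of a text block (hypothetical helper)."""
    for line in block.splitlines():
        if line.strip():
            yield line


# Same sample data as above, with a stray blank line in the middle.
text = """
uid,alias,shell
501,karen,bash

502,john,tcsh
"""

# csv.reader pulls lines from the generator one at a time.
rows = list(csv.reader(non_blank_lines(text)))
for row in rows:
    print(row)
```

Filtering in a generator keeps the blank line from turning into an empty row, and it avoids building an intermediate list for large blocks.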
Alternatively, we can use io.StringIO to turn the text into an in-memory file:
import csv
import io

text = """
uid,alias,shell
501,karen,bash
502,john,tcsh
"""
in_memory_file = io.StringIO(text.strip())
reader = csv.reader(in_memory_file)
for row in reader:
    print(row)
['uid', 'alias', 'shell']
['501', 'karen', 'bash']
['502', 'john', 'tcsh']
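The two approaches are not quite interchangeable when a quoted field contains a line break. As a rough illustration, using made-up data that is not part of the example above, the sketch below parses the same text three ways. A plain splitlines() call drops the line endings, so the embedded newline is lost from the field, while io.StringIO and splitlines(keepends=True) both preserve it:

```python
import csv
import io

# Illustrative data: the "note" field is quoted and spans two lines.
text = 'uid,note\n501,"first line\nsecond line"\n'

# Lines without their endings: the embedded newline is lost.
from_lines = list(csv.reader(text.splitlines()))

# In-memory file: each line keeps its ending, so the newline survives.
from_file = list(csv.reader(io.StringIO(text)))

# Keeping the line endings makes the list-based approach match.
from_keepends = list(csv.reader(text.splitlines(keepends=True)))

print(from_lines[1])
print(from_file[1])
print(from_keepends == from_file)
```

If the text might contain multi-line quoted fields, io.StringIO is probably the safer default.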
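Of course, these techniques also work with the other kind of reader, csv.DictReader. The output shown here comes from Python 3.6 or 3.7, where each row is an OrderedDict; since Python 3.8, rows are plain dicts: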
import csv
import io

text = """
uid,alias,shell
501,karen,bash
502,john,tcsh
"""
in_memory_file = io.StringIO(text.strip())
reader = csv.DictReader(in_memory_file)
for row in reader:
    print(row)
OrderedDict([('uid', '501'), ('alias', 'karen'), ('shell', 'bash')])
OrderedDict([('uid', '502'), ('alias', 'john'), ('shell', 'tcsh')])
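Since each row is keyed by the header names, individual columns can be looked up by name instead of by position. Here is a minimal sketch, assuming the same sample text; the shell_by_alias name is only for illustration:

```python
import csv
import io

text = """
uid,alias,shell
501,karen,bash
502,john,tcsh
"""

reader = csv.DictReader(io.StringIO(text.strip()))

# The header row is consumed to populate fieldnames.
print(reader.fieldnames)

# Build a lookup table from alias to shell using column names.
shell_by_alias = {row["alias"]: row["shell"] for row in reader}
print(shell_by_alias)
```

This works the same way whether the rows are OrderedDict or plain dict objects, since both support lookup by key.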
Conclusion
Parsing a block of CSV text is not that hard. You do not need to write it to an external file, and that is the beauty of the csv library: it can work with a number of input sources, not just files.