Quickstart
This page gives you all the information you need to get started validating your CSVs with CsvPath. Happily, it's not a lot. You will want to go deeper on other pages later.
If you need help getting started with Python, try Python.org's intros. Starting with a project tool like Poetry or Jupyter Notebooks can also help.
The CsvPath library is available through PyPI as csvpath. At this stage, pre-1.0, the project changes frequently, so pin the version you use and update the pin regularly.
If you are using Pip, install CsvPath with one of:
Both of the optional dependencies add a lot of bulk and platform considerations. Pandas is available for validating DataFrames. If you don't need to do that, don't bother installing with Pandas. Likewise, smart-open is used for reading files in S3. If you don't need it, don't install it.
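To confirm what you actually installed, a small standard-library check can help. This is only a sketch: it assumes the optional dependencies import as pandas and smart_open, and the helper names installed_version and has_module are made up for illustration. The version it prints is the one to pin in your requirements.

```python
from importlib import metadata, util


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_module(module: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return util.find_spec(module) is not None


# Assumption: the optional dependencies import as "pandas" and "smart_open".
report = {
    "csvpath version": installed_version("csvpath"),
    "pandas available": has_module("pandas"),
    "smart_open available": has_module("smart_open"),
}
for name, value in report.items():
    print(f"{name}: {value}")
```

A version of None means csvpath is not installed in the active environment.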
You can check under the hood on GitHub. There are detailed docs that you can read in parallel or in addition to this site.
The main class in CsvPath is unsurprisingly called CsvPath. For simple jobs, it is all you need.
The quickest way to bootstrap a real CsvPath project is the command line interface (CLI). The CLI is a super simple tool for managing data files, CsvPath Language files, and results. It is barebones, but very productive. To try the CLI, skip over to Your First Validation, The Lazy Way. Or if you want to continue with the simplest possible Python, keep going.
To do a hello world you'll need to import CsvPath, create an instance, and point it at a CSV file.
Create a script file and do the import:
Create a test CSV file. Save it as trivial.csv or whatever name you like.
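If you would rather generate the test file than type it, the standard csv module will do. The rows below are hypothetical sample data, not contents prescribed by this page; any small CSV with a header row works.

```python
import csv
from pathlib import Path

# Hypothetical sample data: a header row plus two data rows.
rows = [
    ["id", "name", "greeting"],
    ["1", "Ada", "hello"],
    ["2", "Grace", "hi"],
]

csv_file = Path("trivial.csv")

# newline="" lets the csv module control line endings itself.
with csv_file.open("w", newline="", encoding="utf-8") as handle:
    csv.writer(handle).writerows(rows)

print(csv_file.read_text(encoding="utf-8"))
```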
Make a csvpath. Also a trivial one, just to keep it simple.
This path says: open trivial.csv, scan all the lines, and match every one of them.
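The three parts of that description map onto three parts of the csvpath string. The sketch below assumes the csvpath is written as $trivial.csv[*][yes()], which matches the description but is an assumption here. It uses a plain regular expression to label the parts. It is not the library's parser, and split_csvpath is a hypothetical helper that only handles this simple one-line shape.

```python
import re

# Assumed csvpath: "$" + file name, then [scanning part], then [matching part].
CSVPATH = "$trivial.csv[*][yes()]"

# Illustration only: a regex that labels the parts of a simple csvpath.
PATTERN = re.compile(r"^\$(?P<file>[^\[]*)\[(?P<scan>[^\]]*)\]\[(?P<match>.*)\]$")


def split_csvpath(csvpath: str) -> dict:
    """Split a simple one-line csvpath into file, scan, and match parts."""
    found = PATTERN.match(csvpath.strip())
    if found is None:
        raise ValueError(f"not a simple csvpath: {csvpath!r}")
    return found.groupdict()


parts = split_csvpath(CSVPATH)
print(parts["file"])   # the file to open
print(parts["scan"])   # which lines to scan; "*" means all of them
print(parts["match"])  # what to match; yes() is taken to match every line
```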
Here's copy-and-paste Python:
What does this script do?
Line 1 imports CsvPath so we can use it
Line 3 is our csvpath that we'll use to validate our test file, trivial.csv
Line 6 fast-forwards through the CSV file's lines. We could also step through them one by one, if we wanted to.
Line 8 checks if we consider the file valid. If the file didn't meet expectations, our csvpath would have declared the file invalid using the fail() function.
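The steps above can be sketched as a small function. This variant makes several assumptions: that CsvPath().parse() returns the instance, that fast_forward() runs every line, that is_valid reports the outcome, and that the csvpath is $trivial.csv[*][yes()]. The import guard and the function name check are additions for illustration, so the line numbers in the walkthrough do not correspond to this sketch.

```python
try:
    from csvpath import CsvPath  # third-party; installed from PyPI
except ImportError:
    CsvPath = None  # lets the sketch load even where csvpath is absent

# Assumed csvpath: scan every line of trivial.csv and match each one.
CSVPATH = "$trivial.csv[*][yes()]"


def check(csvpath: str = CSVPATH):
    """Return True or False for validity, or None if csvpath is not installed."""
    if CsvPath is None:
        return None
    path = CsvPath().parse(csvpath)  # assumed: parse() returns the instance
    path.fast_forward()              # run through all lines without stepping
    return path.is_valid             # assumed: attribute reporting validity
```

Call check() from the directory that contains trivial.csv. A result of None means the library is not installed in the active environment.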
When you run your script you should see something like:
No question, that is far from impressive! Still, go with it. Small increments are good! You are now ready to dig in and see what CsvPath can really do.