Your First Validation, The Lazy Way
Can we make the first validation example even less taxing? Yes, probably. Let's try!
In the first two approaches to our super-simple first validation exercise, the focus was on the CsvPath Language, but there was also a little Python to drive it. We can do without the Python by using the bare-bones CLI that comes with the CsvPath Library. Here's how.
We're going to use Poetry for our example project. You can learn how to set Poetry up here.
Open the command line and type this:
Change first_example to any project name you like. You should see this:
cd into your new project. Next, add csvpaths to your project with:
You should see this:
We can now run the CsvPath Library's CLI with the poetry run cli command.
If you're not a Poetry user, what we're doing is running a script defined in the pyproject.toml.
You can do the same with:
The CsvPath CLI is bare-bones. Don't expect much! Despite that, it is a useful way to do simple stuff fast. It is great for learning and basic CsvPath Language dev work.
You should see this:
You can select quit for now.
In your project dir, create a subdirectory called assets, or whatever name you like. We'll drop the example CSV file and your csvpath file there. Create a file called first.csvpath, or again, whatever name you like. Into it, paste the simplified version of the example csvpath statement:
Add the example delimited data to the assets dir as example.csv. Use the same trivial data set:
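Before starting the CLI, it can help to confirm that both files are where you expect them. The following is a minimal sketch using only the Python standard library. It is not part of the CsvPath Library, and the function name missing_assets is hypothetical. It assumes you kept the assets, first.csvpath, and example.csv names used above.

```python
from pathlib import Path


def missing_assets(project_dir, names=("first.csvpath", "example.csv"), subdir="assets"):
    """Return the expected asset files that are missing or zero bytes long.

    Hypothetical helper for this walkthrough; not a CsvPath Library function.
    """
    assets = Path(project_dir) / subdir
    problems = []
    for name in names:
        path = assets / name
        # A file that does not exist, or exists but is empty, needs attention.
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Run from the project directory; an empty list means both files are in place.
    print(missing_assets("."))
```

If you chose different directory or file names, pass them in through the names and subdir arguments.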
Now we're ready to run the validation. Fire up the CLI again with poetry run cli. Select named-files. You should see this:
Hit return on add named-file. We're going to import your file into the FileManager's files area. The file manager is used whenever one of your CsvPaths instances needs to run a validation. When you hit return, the CLI should ask you for a name for the file you are going to import:
Any name works. example would be a good choice. You should then see a selection of dir, file, or json.
Pick file. Next you will select your file by drilling down into your assets directory. Select your file and hit return.
Once your file is added, you go back to the top menu. This time select named-paths. In the next submenu, pick add named-paths.
Again, you enter a name for the csvpaths you are adding. first would be a fine name. You next drill down to your first.csvpath file in the assets dir. Select it and hit return. You'll be taken back to the top menu.
Now you're ready to run your example. Select run and hit return. You will be asked for the name of a file. Select your file's name, example.
Next you'll be asked for the name of your csvpaths:
You have just one named-paths name, first, so select that and hit return.
Now you get the question of what method you want to use to run your paths against your file. The options are fast forward or collect. As you may by now know, fast forward runs your validation, but doesn't collect the matching lines. Instead, it only collects variables, printouts, and errors. The collect method does collect the matching lines, in addition to variables, printouts, and errors.
As an aside, the library also allows you to step through a CSV file as it is being validated, line by line. However, the CLI does not offer that option. You can easily do it programmatically using a CsvPaths instance's next_paths() method in a for line in csvpath.next_paths() loop.
For our purposes, either method works. Pick collect.
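To picture the difference between the run methods, here is a toy model in plain Python. It is deliberately not CsvPath Library code: the functions next_lines, collect, and fast_forward, the lines_seen variable, and the sample data are all hypothetical stand-ins. It only illustrates the idea that every method walks the whole file and produces side effects such as variables, while only collecting keeps the matching lines and only stepping hands them to you one at a time.

```python
import csv
import io
from typing import Callable, Dict, Iterator, List

Row = List[str]

# Hypothetical sample data, used only for this illustration.
SAMPLE = "a,b\n1,2\n3,4\n"


def next_lines(text: str, matches: Callable[[Row], bool], variables: Dict) -> Iterator[Row]:
    """Step through the data, yielding each matching line as it is found."""
    for row in csv.reader(io.StringIO(text)):
        # Side effects (here, a simple counter) happen for every line scanned.
        variables["lines_seen"] = variables.get("lines_seen", 0) + 1
        if matches(row):
            yield row


def collect(text: str, matches: Callable[[Row], bool], variables: Dict) -> List[Row]:
    """Run to the end and keep every matching line."""
    return list(next_lines(text, matches, variables))


def fast_forward(text: str, matches: Callable[[Row], bool], variables: Dict) -> None:
    """Run to the end for the side effects only; matching lines are discarded."""
    for _ in next_lines(text, matches, variables):
        pass


if __name__ == "__main__":
    collected_vars, skipped_vars = {}, {}
    print(collect(SAMPLE, lambda row: True, collected_vars), collected_vars)
    print(fast_forward(SAMPLE, lambda row: True, skipped_vars), skipped_vars)
```

Both calls leave the same variables behind. Only collect returns lines, which is why the collect method is the one that gives you a populated data.csv to look at afterwards.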
The CLI briefly tells you it is running. Then you're back at the top menu. You have successfully completed your first validation run. Congrats!
Now let's take a look at what resulted from our validation run. Select named-results.
The CLI is so simple it can only open our results in your operating system's file browser. But that will do for learning and developing. Select open named-result and select first. A new window opens showing the runs of your first named-paths group. So far you have just one run. It should be timestamped about a minute ago.
Inside your first run you should see these files:
data.csv has all the lines from your example.csv file with no changes made. Our validation matched all the lines and we used the collect method (technically, CsvPaths.collect_paths()), so everything in the original file came through unchanged. errors.json is empty because there were no errors. We didn't set any variables, so vars.json is empty. And we didn't print anything as the run happened, so printouts.txt is also empty. Not a lot to see here, but we were expecting that, so it is a good thing.
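If you would rather check the run directory from a script than by eye, a small standard-library sketch like the one below can report which result files hold content. It is not part of the CsvPath Library and the function name summarize_run is hypothetical. It assumes the five file names mentioned in this walkthrough, and it assumes an "empty" JSON file may contain an empty list or object rather than zero bytes. Pass it the path of the run directory that the CLI opened in your file browser.

```python
import json
from pathlib import Path

# File names taken from this walkthrough.
EXPECTED = ("data.csv", "errors.json", "vars.json", "printouts.txt", "meta.json")


def summarize_run(run_dir):
    """Map each expected file name to 'missing', 'empty', or 'has content'."""
    summary = {}
    for name in EXPECTED:
        path = Path(run_dir) / name
        if not path.is_file():
            summary[name] = "missing"
            continue
        text = path.read_text(encoding="utf-8").strip()
        if name.endswith(".json") and text:
            try:
                # Assumption: [] or {} counts as empty, just like a blank file.
                is_empty = json.loads(text) in ([], {})
            except json.JSONDecodeError:
                is_empty = False
        else:
            is_empty = not text
        summary[name] = "empty" if is_empty else "has content"
    return summary
```

For the run described here, you would expect data.csv and meta.json to have content and the other three files to be empty.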
There is a good amount of metadata in meta.json. If you open that file you should see something like this:
On line 16 you can see what file we used. It is the one you imported earlier. You can learn more about how the CsvPath Library manages files here. And read this page for more information about named-paths group validation results.
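As an alternative to scanning meta.json by line number, you could search it for your file's name. The sketch below uses only the Python standard library and makes no assumption about which keys meta.json contains. The function names find_value and where_is are hypothetical, and the path you pass in depends on where your run directory lives.

```python
import json
from pathlib import Path


def find_value(node, needle, path=""):
    """Yield (key_path, value) for every string value that contains needle."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_value(value, needle, f"{path}/{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_value(value, needle, f"{path}/{index}")
    elif isinstance(node, str) and needle in node:
        yield path, node


def where_is(meta_path, needle):
    """Load a JSON file and list every place a string value mentions needle."""
    meta = json.loads(Path(meta_path).read_text(encoding="utf-8"))
    return list(find_value(meta, needle))
```

Calling where_is with the path to your meta.json and the string example.csv should return the key paths whose values mention the imported file.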
And that's it. Your first validation. Simplified and no Python code involved. Not bad!