The Modes

In the context of a CsvPaths instance's run, an individual CsvPath instance can operate in several possible modes that allow you to configure its behavior without resorting to the global config.ini or applying settings programmatically. In particular, the modes help you configure groups of csvpaths more flexibly. You can use them to easily disable individual csvpaths or configure them differently than other csvpaths in the same named-paths group.

Modes are set in your csvpath's comments. The modes are:

error-mode: [bare / full]
explain-mode: [explain / no-explain]
files-mode: (all or any combination of)
- all
- data / no-data
- unmatched / no-unmatched
- printouts / no-printouts
logic-mode: [AND / OR]
print-mode: [default / no-default]
return-mode: [matches / no-matches]
run-mode: [run / no-run]
source-mode: preceding
transfer-mode: data / unmatched > var-name
unmatched-mode: [keep / no-keep]
validation-mode: (any combination of)
- print / no-print
- raise / no-raise
- stop / no-stop
- fail / no-fail
- collect / no-collect
- match / no-match

Modes are only set in external comments. External comments are comments that are outside the csvpath, above or below it. External comments can also have other user-defined metadata and plain text mixed in with mode settings. If a mode setting is followed by plain text there must be a stand-alone colon between the mode and the text.
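
The stand-alone colon rule can be sketched in a few lines of Python. This is a minimal illustration, not the CsvPath Library's own parser: parse_modes, KEY and STANDALONE_COLON are hypothetical names, and the sketch assumes keys are written with no space before their colon.

```python
import re

# Hypothetical helper for illustration; this is not the CsvPath Library's parser.
KEY = re.compile(r"([A-Za-z][\w-]*):")         # a metadata key such as "id:" or "run-mode:"
STANDALONE_COLON = re.compile(r"\s:(?:\s|$)")  # a colon with whitespace on both sides


def parse_modes(comment: str) -> dict:
    """Return {mode-name: [values]} for every *-mode key found in a comment."""
    modes = {}
    keys = list(KEY.finditer(comment))
    for i, key in enumerate(keys):
        # A value runs until the next key or the end of the comment...
        end = keys[i + 1].start() if i + 1 < len(keys) else len(comment)
        value = comment[key.end():end]
        # ...unless a stand-alone colon cuts it short; the rest is plain text.
        value = STANDALONE_COLON.split(value, maxsplit=1)[0]
        if key.group(1).endswith("-mode"):
            modes[key.group(1)] = [v.strip() for v in value.split(",") if v.strip()]
    return modes


comment = """
   id: next please!
   validation-mode: no-raise, print
   print-mode: default :
   All of these mode settings are optional, of course!
"""
print(parse_modes(comment))
```

Without the stand-alone colon after default, this sketch would read the plain text that follows as part of the print-mode value.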

Defaults

When a mode is not explicitly set, CsvPath uses sensible defaults. Some modes default to options set in config/config.ini. For example, validation-mode overrides [errors] csvpath in config.ini. (Read here for more about the config file.) Other defaults are built in. For instance, logic-mode overrides the library's built-in default of ANDing match components. The defaults are:

error-mode: defaults to bare, meaning error() and built-in errors are presented minimally
explain-mode: no explanations are logged when logging is set to INFO
files-mode: there is no check for optional files having been generated
logic-mode: match components are ANDed
print-mode: print statements go to the console
return-mode: matches are returned
run-mode: the csvpath is run
source-mode: the named-file that was passed to the named-paths group is used as input
transfer-mode: no result data transfer is made
unmatched-mode: the lines not returned are discarded
validation-mode: validation errors are only printed and logged

An Example

The settings are configured as in this example of two trivial csvpaths in a named-paths group called example:

~
   id: hello_world
   run-mode: no-run
~
$[*][ yes() ]

---- CSVPATH ----

~  
   id: next please!
   explain-mode: explain
   validation-mode: no-raise, print
   logic-mode: OR
   return-mode: matches
   unmatched-mode: keep
   print-mode: default :
   All of these mode settings are optional, of course! And they don't have to be written as neatly as this, either.   
~
$[*][
   import($example.csvpaths.hello_world)
   yes()
]

hello_world will not be run when the named-paths group runs, but it will be imported into the second csvpath identified as next please!. This example doesn't do much, but it gives an idea of how you can easily configure individual csvpaths within a group that will be run as a single unit. As you can see, some modes can take multiple values separated by commas.

Detailed Descriptions

Run Mode

Setting

no-run

The csvpath will not be run on its own. It only runs as an import into another csvpath that is runnable.

run

Run is the default.

Validation Mode

Validation mode controls how the CsvPath instance reacts to built-in validation errors. Built-in validation errors have two types:

Problems with the csvpath's syntax or structure
Problems with the data being validated

Setting

raise

The setting raise indicates that when a validation problem occurs, an exception should be raised that will likely halt the program. The opposite is no-raise. Setting neither value defaults the decision back to the global config.ini setting.

print

The print setting makes the CsvPath instance print validation messages to all configured Printer instances. The opposite is no-print.

stop

The stop mode setting makes the CsvPath instance stop as soon as a validation problem occurs. no-stop prevents this premature completion, enabling the CsvPath instance to alert and continue.

fail

The fail setting marks the csvpath being run as invalid. Effectively this means setting the CsvPath instance's is_valid property to False. The opposite setting is no-fail. Failing does not stop the program or the validation run.

match

When match is set, a built-in validation error will match rather than fail to match. This setting applies only to errors in the data (e.g. adding "five", not 5). Errors in the CsvPath Language are still not allowed. As a practical example, add("five", 5) never works, but add(@five, 5) always does: even if @five turns out not to be a number on a particular line, we still match on it in accordance with this setting. Whether or not you set match, if you have not set no-raise your csvpath will blow up on validation errors.

collect

When collect is set errors are captured. When no-collect is set they are dropped. You can drop errors and still fail a file to make it invalid; just as you can capture errors but choose to not use fail(). Keep in mind that when you don't collect errors CsvPath.has_errors() is False. Also bear in mind that if you are using the OpenTelemetry integration (e.g. to push events to Grafana, New Relic, etc.) you can choose to drop errors but still fire error events.
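
The way each validation setting pairs with its no- opposite can be sketched as follows. This is a hedged illustration only: resolve_validation_flags and the config_defaults dictionary are hypothetical, and the sketch assumes every setting falls back to a configured default the way raise is described as doing above.

```python
# Hypothetical helper; the names and the fallback values are assumptions.
VALIDATION_SETTINGS = ("print", "raise", "stop", "fail", "collect", "match")


def resolve_validation_flags(validation_mode: str, config_defaults: dict) -> dict:
    """Turn a validation-mode value such as "no-raise, print" into on/off flags."""
    settings = {s.strip() for s in validation_mode.split(",") if s.strip()}
    flags = {}
    for name in VALIDATION_SETTINGS:
        if name in settings:
            flags[name] = True            # explicitly switched on
        elif f"no-{name}" in settings:
            flags[name] = False           # explicitly switched off
        else:
            # Neither form given: fall back to the configured default.
            flags[name] = config_defaults.get(name, False)
    return flags


flags = resolve_validation_flags("no-raise, print", {"raise": True, "stop": True})
print(flags)
```

Here raise is turned off by the csvpath's own comment even though the stand-in config default has it on, while stop keeps its configured value because neither stop nor no-stop was set.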

Logic Mode

AND

AND is the default logic mode. It requires that all match components evaluate to True for a line to match.

OR

OR mode is similar to how the or() function works. Any match component that evaluates to true makes the line match.
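
As a rough mental model, the two logic modes behave like Python's all() and any() applied to the results of a line's match components. The function below is a hypothetical sketch, not library code.

```python
def line_matches(component_results: list, logic_mode: str = "AND") -> bool:
    """Combine the True/False results of a line's match components."""
    if logic_mode == "AND":
        return all(component_results)   # every component must be True
    if logic_mode == "OR":
        return any(component_results)   # one True component is enough
    raise ValueError(f"unknown logic-mode: {logic_mode}")


results = [True, False, True]
print(line_matches(results, "AND"), line_matches(results, "OR"))
```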

Return Mode

Setting

matches

All the matching lines will be returned by next() or collect(). (fast_forward() never returns lines, regardless of mode). This is the default behavior.

no-matches

All the lines that fail to match will be returned.

Print Mode

CsvPath supports printing errors and user-defined messages to any number of Printer objects using the print() and error() functions. Printers send text to separate queues. By default a "standard out" printer is enabled that prints to the console, as well as to a file. If you don't want anything printed to the console you would set no-default.

Setting

default

When default is set the CsvPath instance prints to the console, as well as any other Printer instances you configure.

no-default

When no-default is set the standard console printer is disabled.

Explain Mode

Setting

explain

When set, a step-by-step explanation of the values, assignments, matches, etc. is dumped to INFO for each line in the file being processed. This can be a good aid to debugging, but it is expensive: the performance hit can be around 20-25%.

no-explain

no-explain is the default.

Unmatched Mode

Setting

keep

Return mode determines if matches or non-matches are returned. Unmatched mode determines if the non-returned lines are kept available in the Result instance or on the CsvPath instance. If the lines are kept and you are using a CsvPaths instance, the Result instance will be serialized to the archive directory and you will see an unmatched.csv file containing the lines.

no-keep

Lines that were not returned are discarded. This is the default.
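
The interplay between return-mode and unmatched-mode can be pictured with a small sketch. split_lines and its arguments are hypothetical names used only for illustration. The sketch assumes a matcher function that returns True for matching lines.

```python
def split_lines(lines, matcher, return_mode="matches", unmatched_mode="no-keep"):
    """Return (returned, kept): the lines handed back and the lines kept aside."""
    matched = [line for line in lines if matcher(line)]
    not_matched = [line for line in lines if not matcher(line)]
    # return-mode picks which of the two groups is returned.
    if return_mode == "matches":
        returned, other = matched, not_matched
    else:  # "no-matches"
        returned, other = not_matched, matched
    # unmatched-mode decides whether the other group is kept or dropped.
    kept = other if unmatched_mode == "keep" else []
    return returned, kept


lines = [["a", "1"], ["b", "x"], ["c", "3"]]
is_number = lambda line: line[1].isdigit()
print(split_lines(lines, is_number, "matches", "keep"))
```

Note that with return-mode set to no-matches, the lines that are kept are the ones that did match, because "unmatched" refers to whatever was not returned.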

Files Mode

The impact of files-mode is that the run instance manifest and the csvpath's manifest will show that files were created as expected, or not.

There are various reasons why printouts.txt, data.csv and unmatched.csv might not be generated. For example, if we expect no validation output from user-created print() statements or built-in validation error messages, we might set files-mode to no-printouts. If a validation error was then printed, we would be alerted in the metadata. In another example, if we set unmatched-mode to no-keep (the default) and files-mode to unmatched, we have a conflict that we'll be alerted to in the metadata. Similarly, if we set files-mode to data and then run fast_forward_paths(), we will not get data.csv files and the metadata will alert us to the mismatch.

errors.json, vars.json, meta.json, and manifest.json are always generated, regardless of files-mode. When you set files-mode to all the CsvPath Library will double-check that meta, vars, errors were correctly created, but that part of its checking is superfluous.

Setting

all

All file types are expected to be generated

data / no-data

Determines if the data.csv file is expected

unmatched / no-unmatched

Determines if the unmatched.csv file is expected

printouts / no-printouts

Determines if we expect anything to be sent to the Printer instances using print()
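
The kind of expectation check that files-mode describes might look like the sketch below. Everything in it is an assumption made for illustration: the OPTIONAL_FILES mapping, the function name, the comma-separated syntax, and the wording of the messages are not taken from the CsvPath Library, which reports mismatches in its manifests.

```python
# Hypothetical mapping and checker; the library's own check is not shown here.
OPTIONAL_FILES = {
    "data": "data.csv",
    "unmatched": "unmatched.csv",
    "printouts": "printouts.txt",
}


def files_mode_mismatches(files_mode: str, generated: set) -> list:
    """List the ways the generated files differ from what files-mode expects."""
    mismatches = []
    for setting in (s.strip() for s in files_mode.split(",")):
        if setting == "all":
            expected, unexpected = list(OPTIONAL_FILES.values()), []
        elif setting.startswith("no-") and setting[3:] in OPTIONAL_FILES:
            expected, unexpected = [], [OPTIONAL_FILES[setting[3:]]]
        elif setting in OPTIONAL_FILES:
            expected, unexpected = [OPTIONAL_FILES[setting]], []
        else:
            continue  # ignore anything that is not a files-mode setting
        mismatches += [f"{name} expected but missing"
                       for name in expected if name not in generated]
        mismatches += [f"{name} generated but not expected"
                       for name in unexpected if name in generated]
    return mismatches


print(files_mode_mismatches("data, no-printouts", {"printouts.txt"}))
```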

Source Mode

Usually the data for a csvpath in a named-paths group comes from the data input for the whole group. I.e., all the csvpaths in the group run against the same source file. However, in some cases you might want the input to a csvpath to be the csvpath preceding it. Meaning that the results captured from the first csvpath are piped into the second. To do this, you set source-mode: preceding on the second csvpath.

Keep in mind that CsvPaths instances' _collects methods and _by_line methods are quite different in how they handle data sources. Source mode does not apply to by-lines runs. It is for linear runs, not breadth-first runs, because in a by-lines run each line is passed through each of the csvpaths in the named-paths group before the next line is considered. Csvpaths in a by-lines run can change data for downstream csvpaths in their named-paths group, and they can skip or advance the run in order to filter data so that downstream csvpaths don't have a chance at it. In other words, there are multiple ways of allowing earlier csvpaths to have an effect on later csvpaths.

Source mode is closely related to rewind/replay, to references between data sets, and to strategies for validation and canonicalization.

Setting

preceding

Instructs the csvpath to use the output of the preceding csvpath in the named-paths group as its input data
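
Conceptually, source-mode: preceding turns a linear run into a pipeline. The sketch below models each csvpath as a plain matcher function paired with an optional source mode. run_linear is a hypothetical name and the model is a simplification, not how the library is implemented.

```python
def run_linear(source_lines, csvpaths):
    """Run (matcher, source_mode) pairs one after another and collect each output."""
    outputs = []
    for matcher, source_mode in csvpaths:
        if source_mode == "preceding" and outputs:
            input_lines = outputs[-1]      # read what the previous csvpath captured
        else:
            input_lines = source_lines     # default: read the group's source file
        outputs.append([line for line in input_lines if matcher(line)])
    return outputs


lines = [["a", "1"], ["b", "2"], ["c", "3"]]
outputs = run_linear(lines, [
    (lambda line: int(line[1]) >= 2, None),          # first csvpath: original source
    (lambda line: line[0] != "b", "preceding"),      # second csvpath: first one's output
])
print(outputs)
```

The second matcher never sees the line starting with "a" because the first csvpath did not capture it.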

Transfer Mode

transfer-mode lets you copy data.csv or unmatched.csv to an arbitrary location in the transfers directory. The transfers directory is configured in config/config.ini under [results] transfers. To use transfer-mode you use the form data | unmatched > var-name, where var-name is the name of a variable whose value is the relative path under the transfers directory to which the data is copied. Note that transfer-mode has no effect on the original data, in keeping with CsvPath Library's copy-on-write semantics. You may have as many transfers as you like by separating them with commas. Read more about using transfer-mode here.

Setting

data > var-name

Indicates you are transferring data.csv to the value of var-name as a relative path within the transfers directory

unmatched > var-name

Indicates you are transferring unmatched.csv to the value of var-name as a relative path within the transfers directory
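
To make the data | unmatched > var-name form concrete, here is a minimal sketch of how a transfer-mode value could be resolved into copy targets. plan_transfers, the variable names, and the paths are hypothetical. The library's actual resolution logic may differ.

```python
from pathlib import PurePosixPath


def plan_transfers(transfer_mode: str, variables: dict, transfers_dir: str) -> list:
    """Return (source file, destination path) pairs for a transfer-mode value."""
    plans = []
    for item in transfer_mode.split(","):
        # Each item has the form "data > var-name" or "unmatched > var-name".
        source, var_name = (part.strip() for part in item.split(">"))
        # The variable's value is a path relative to the transfers directory.
        destination = PurePosixPath(transfers_dir) / variables[var_name]
        plans.append((f"{source}.csv", str(destination)))
    return plans


variables = {"good": "orders/ok.csv", "bad": "orders/rejects.csv"}  # made-up values
print(plan_transfers("data > good, unmatched > bad", variables, "transfers"))
```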

Error Mode

error-mode allows you to output errors with log-like information or as plain messages.

Setting

bare

Errors are output as simple strings

full

Errors are output according to the [errors] pattern config value using the following fields:

time: Time
file: Named-file name
line: Line number
paths: Named-paths name
instance: Csvpath instance ID/name
chain: Match component chain
message: Message

The default pattern is:

{time}:{file}:{line}:{paths}:{instance}:{chain}: {message}

The chain field gives the parent-child relationships from the top match component to the match component child that was the source of the error.
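
The default pattern uses the same brace syntax as Python's str.format, so the difference between bare and full can be sketched as below. format_error is a hypothetical function and the field values are made up for illustration. Only the pattern string and the field names come from the description above.

```python
DEFAULT_PATTERN = "{time}:{file}:{line}:{paths}:{instance}:{chain}: {message}"


def format_error(fields: dict, error_mode: str = "bare",
                 pattern: str = DEFAULT_PATTERN) -> str:
    """Render an error as a bare message or with the full pattern."""
    if error_mode == "bare":
        return fields["message"]          # simple string only
    return pattern.format(**fields)       # log-like line built from the pattern


# Made-up field values, for illustration only.
fields = {
    "time": "2025-01-01 12:00:00",
    "file": "orders",
    "line": 7,
    "paths": "example",
    "instance": "hello_world",
    "chain": "add",
    "message": "not a number",
}
print(format_error(fields, "bare"))
print(format_error(fields, "full"))
```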
