# Source Staging

<figure><img src="/files/GCzwuPWchQz3vDpqqBRO" alt="" width="375"><figcaption><p>The source staging area is for named versions of raw inbound files</p></figcaption></figure>

CsvPath Framework collects all inbound files into a staging area. This area is:

* A permanent immutable record of all versions of inbound files
* The source for the validation and upgrading engine
* Available for inspection by individuals triaging downstream problems
* Accessible to any system that doesn't want the validation, upgrading, and metadata that CsvPath Framework runs offer. *(We anticipate that the number of such ambivalent systems is approximately zero, but access to the raw source files is available regardless.)*

The source staging area can mirror any current directory layout. The "Path to file" box in the diagram above represents any file system structure you like. The structure is defined on a named-file by named-file basis using a template. We cover templates later in this documentation.

The "File name (as a directory)" box is just what it says: a directory named for a source file. For example, if an inbound raw source file is named `2025-apr-01-sales-emea.csv`, its content lives in a directory that is also named `2025-apr-01-sales-emea.csv`.

The file's actual bytes live in files named by their SHA-256 hash values. Each hash fingerprint is unique to the exact content of one version of the file. If a new copy of `2025-apr-01-sales-emea.csv` arrives a day later and differs from the original by even one character, CsvPath Framework stores the new version in a file named by the hash of the new content.
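The hash-named storage described above can be sketched in a few lines of plain Python. This is only an illustration, not CsvPath Framework's implementation: the `stage_file` function, the `.csv` extension on the hash-named file, and the simplified layout (which omits the templated "Path to file" segment) are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def stage_file(staging_root: Path, named_file: str, file_name: str, content: bytes) -> Path:
    """Hypothetical helper: store content at
    <staging_root>/<named_file>/<file_name>/<sha256>.csv and return that path."""
    # The fingerprint depends only on the bytes, so any change yields a new name.
    fingerprint = hashlib.sha256(content).hexdigest()
    # The original file name becomes a directory that holds every version.
    version_dir = staging_root / named_file / file_name
    version_dir.mkdir(parents=True, exist_ok=True)
    target = version_dir / f"{fingerprint}.csv"
    # Identical content maps to the same name, so existing versions are never rewritten.
    if not target.exists():
        target.write_bytes(content)
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = b"id,total\n1,100\n"
    changed = b"id,total\n1,101\n"  # one character differs from the original
    v1 = stage_file(root, "orders", "2025-apr-01-sales-emea.csv", original)
    v2 = stage_file(root, "orders", "2025-apr-01-sales-emea.csv", changed)
    v1_again = stage_file(root, "orders", "2025-apr-01-sales-emea.csv", original)
    version_count = len(list(v1.parent.iterdir()))

same_directory = v1.parent == v2.parent  # both versions share the file-name directory
distinct_versions = v1 != v2             # different content, different hash-named file
deduplicated = v1 == v1_again            # same content, same hash-named file
```

In this sketch a re-delivered, byte-identical file does not create a new version, while a one-character change does. Whether the framework treats exact duplicates the same way is not stated on this page.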

The named-file name is an abstract name like `orders` or `EMEA-orders` or `Q2-orders-Acme-EMEA`. It is whatever you like. The path within the named-file is constructed according to a template that is based on the path where the managed file transfer (MFT) system received the file. That means there can be multiple paths within a single named-file name. Likewise, the name of the data file is likely to change over time. When it does, CsvPath Framework captures the new name and its new hash fingerprint.

The abstract named-file name can be used on its own to start a run. When you do that, CsvPath assumes you mean the most recent file that was registered under that name.
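The "most recent registration wins" rule can be pictured with a small sketch. The `Registration` record and the `most_recent` function below are hypothetical names invented for this illustration; they are not part of the CsvPath API, and the fingerprints are shortened placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Registration:
    """Hypothetical record of one file arriving under a named-file name."""
    named_file: str
    file_name: str
    fingerprint: str
    arrived_at: datetime


def most_recent(registrations: list[Registration], named_file: str) -> Registration:
    """Return the latest registration for a named-file name (illustrative only)."""
    candidates = [r for r in registrations if r.named_file == named_file]
    if not candidates:
        raise LookupError(f"no files registered under {named_file!r}")
    return max(candidates, key=lambda r: r.arrived_at)


registrations = [
    Registration("orders", "2025-apr-01-sales-emea.csv", "aaa111",
                 datetime(2025, 4, 1, 9, 0, tzinfo=timezone.utc)),
    Registration("orders", "2025-apr-02-sales-emea.csv", "bbb222",
                 datetime(2025, 4, 2, 9, 0, tzinfo=timezone.utc)),
    Registration("invoices", "2025-apr-03-invoices.csv", "ccc333",
                 datetime(2025, 4, 3, 9, 0, tzinfo=timezone.utc)),
]

latest_orders = most_recent(registrations, "orders")
```

Here `latest_orders` is the April 2 file, even though a later file exists under a different named-file name.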

Alternatively, you can refer to a file by combining the named-file name with the full path to the file. You can also use a partial path to find one or more files. A partial path can include pointers that dynamically select a version of one or more files registered under the named-file name at a given location, within an arrival window, or both.

We will explain how this flexibility works and is helpful later in these docs.

Finally, the named-file directory contains a `manifest.json` that tracks arrival times, identities, and other automatically generated metadata.
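To make the role of the manifest concrete, here is a minimal sketch of appending an arrival record to a `manifest.json`. The field names (`file_name`, `fingerprint`, `arrived_at`), the list-of-entries layout, and the `record_arrival` function are assumptions for illustration only; the actual manifest schema that CsvPath Framework writes is not described on this page.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def record_arrival(named_file_dir: Path, file_name: str, content: bytes) -> dict:
    """Hypothetical helper: append one arrival entry to <named_file_dir>/manifest.json."""
    manifest_path = named_file_dir / "manifest.json"
    # Load existing entries, or start a new list on first arrival.
    entries = json.loads(manifest_path.read_text()) if manifest_path.exists() else []
    entry = {
        "file_name": file_name,                              # assumed field name
        "fingerprint": hashlib.sha256(content).hexdigest(),  # assumed field name
        "arrived_at": datetime.now(timezone.utc).isoformat(),  # assumed field name
    }
    entries.append(entry)
    manifest_path.write_text(json.dumps(entries, indent=2))
    return entry


with tempfile.TemporaryDirectory() as tmp:
    named_file_dir = Path(tmp) / "orders"
    named_file_dir.mkdir()
    first = record_arrival(named_file_dir, "2025-apr-01-sales-emea.csv", b"id,total\n1,100\n")
    second = record_arrival(named_file_dir, "2025-apr-01-sales-emea.csv", b"id,total\n1,101\n")
    manifest = json.loads((named_file_dir / "manifest.json").read_text())
```

After both calls, `manifest` holds two entries for the same file name with different fingerprints, in arrival order.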

