Data Models
This section documents the data models and structures used throughout the EVE Pipeline.
Document Model
The primary data structure representing documents in the pipeline.
Unified document object that encapsulates content and metadata throughout the pipeline.
This replaces the need to pass (Path, str) tuples and provides a consistent interface for document handling across all pipeline stages.
Attributes:
| Name | Type | Description |
|---|---|---|
| `content` | `str` | The actual document text content |
| `file_path` | `Path` | Path to the source file |
| `file_format` | `str` | Format of the source file (pdf, md, html, etc.) |
| `metadata` | `Dict[str, Any]` | Original metadata from the document (preserved from source) |
| `embedding` | `Optional[List[float]]` | Optional embedding vector for the document |
| `pipeline_metadata` | `Dict[str, Any]` | Metadata added by pipeline steps (filters, processing, etc.) |
Source code in eve/model/document.py
content_length
property
Get the length of the content.
extension
property
Get the file extension.
filename
property
Get the filename without path.
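The attributes and properties listed above can be pictured as a small dataclass. The following is only a sketch: `DocumentSketch` is a hypothetical stand-in, not the class in eve/model/document.py, and the property bodies (character count for `content_length`, `Path.suffix` for `extension`, `Path.name` for `filename`) are assumptions based on the one-line descriptions.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List, Optional


@dataclass
class DocumentSketch:
    """Hypothetical stand-in that mirrors the documented attributes."""

    content: str
    file_path: Path
    file_format: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    embedding: Optional[List[float]] = None
    pipeline_metadata: Dict[str, Any] = field(default_factory=dict)

    @property
    def content_length(self) -> int:
        # Assumption: length is measured in characters.
        return len(self.content)

    @property
    def extension(self) -> str:
        # Assumption: the extension keeps its leading dot, as Path.suffix does.
        return self.file_path.suffix

    @property
    def filename(self) -> str:
        # Final path component, without any parent directories.
        return self.file_path.name


doc = DocumentSketch(
    content="Hello, EVE.",
    file_path=Path("data/report.md"),
    file_format="md",
)
print(doc.filename, doc.extension, doc.content_length)
```

Using `default_factory` for the two dictionaries gives each instance its own mapping, so metadata written to one document never leaks into another.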
__repr__()
Detailed representation.
__str__()
String representation showing filename and content length.
add_metadata(key, value)
Add an entry to the original metadata.
add_pipeline_metadata(key, value)
Add an entry to pipeline metadata (for tracking pipeline processing).
from_path_and_content(file_path, content, **metadata)
classmethod
Create a Document from a file path and content string.
from_tuple(path_content_tuple, **metadata)
classmethod
Create a Document from a (Path, str) tuple for backwards compatibility.
get_metadata(key, default=None)
Get a value from the original metadata with optional default.
get_pipeline_metadata(key, default=None)
Get a value from pipeline metadata with optional default.
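The four metadata accessors keep source metadata and pipeline metadata in separate dictionaries. A minimal sketch of that separation follows; `MetadataSketch` is a hypothetical stand-in, the method bodies are assumed to be thin wrappers over `dict`, and the keys `author` and `dedup_passed` are made up for illustration.

```python
from typing import Any, Dict


class MetadataSketch:
    """Hypothetical stand-in showing two independent metadata stores."""

    def __init__(self) -> None:
        self.metadata: Dict[str, Any] = {}           # preserved from the source
        self.pipeline_metadata: Dict[str, Any] = {}  # written by pipeline steps

    def add_metadata(self, key: str, value: Any) -> None:
        self.metadata[key] = value

    def get_metadata(self, key: str, default: Any = None) -> Any:
        return self.metadata.get(key, default)

    def add_pipeline_metadata(self, key: str, value: Any) -> None:
        self.pipeline_metadata[key] = value

    def get_pipeline_metadata(self, key: str, default: Any = None) -> Any:
        return self.pipeline_metadata.get(key, default)


doc = MetadataSketch()
doc.add_metadata("author", "Ada")                # illustrative key
doc.add_pipeline_metadata("dedup_passed", True)  # illustrative key

# A key written to one store is not visible through the other.
print(doc.get_metadata("dedup_passed", "missing"))
print(doc.get_pipeline_metadata("dedup_passed"))
```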
is_empty()
Check if the document content is empty.
to_tuple()
Convert to (Path, str) tuple for backwards compatibility.
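Together, `from_tuple` and `to_tuple` let code that still passes (Path, str) tuples interoperate with document objects. The round trip might look like the sketch below. `DocumentSketch` is a hypothetical stand-in, and two details are assumptions rather than documented behavior: that the file format is derived from the path suffix, and that extra keyword arguments are stored in `metadata`.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Tuple


@dataclass
class DocumentSketch:
    """Hypothetical stand-in for the tuple conversion helpers."""

    content: str
    file_path: Path
    file_format: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_tuple(
        cls, path_content_tuple: Tuple[Path, str], **metadata: Any
    ) -> "DocumentSketch":
        file_path, content = path_content_tuple
        # Assumption: the format is the suffix without its leading dot.
        file_format = file_path.suffix.lstrip(".")
        # Assumption: keyword arguments become source metadata.
        return cls(content, file_path, file_format, dict(metadata))

    def to_tuple(self) -> Tuple[Path, str]:
        return (self.file_path, self.content)


legacy = (Path("docs/intro.html"), "<p>Hi</p>")
doc = DocumentSketch.from_tuple(legacy, source="crawler")

# Converting back yields the original tuple; metadata is not part of it.
print(doc.to_tuple() == legacy)
```

Note that the tuple form carries only the path and the content, so any metadata attached to the document is lost on the way back.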
update_content(new_content)
Update the document content and track the change in metadata.
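The description does not say which metadata entries `update_content` writes, so the following sketch only illustrates the general idea of recording a content change. `ContentSketch` is a hypothetical stand-in, and the keys `original_length` and `current_length` are invented for this example; the real method may track different information.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ContentSketch:
    """Hypothetical stand-in for content updates with change tracking."""

    content: str
    pipeline_metadata: Dict[str, Any] = field(default_factory=dict)

    def update_content(self, new_content: str) -> None:
        # Assumed bookkeeping: remember the length seen before the first
        # update, and always record the length after the latest update.
        self.pipeline_metadata.setdefault("original_length", len(self.content))
        self.content = new_content
        self.pipeline_metadata["current_length"] = len(new_content)


doc = ContentSketch(content="  raw text with padding  ")
doc.update_content(doc.content.strip())
print(doc.content)
print(doc.pipeline_metadata)
```

Routing every change through one method means a later step can tell how much a cleaning or filtering stage altered the text without keeping the old content around.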
Configuration Models
Data models for pipeline configuration.
Inputs Configuration
Bases: BaseModel
Source code in eve/config.py
Pipeline Configuration
Bases: BaseModel