Data Models
This section documents the key data models used throughout the RepoPacker.jl package.
RepoPacker.add_extension — Method
add_extension(ext::AbstractString)
Add a file extension (e.g., ".r", ".sql") to the list of recognized text file extensions. The extension must include the leading dot.
Arguments
ext: File extension to add (must start with a dot)
Examples
RepoPacker.add_extension(".r")
RepoPacker.add_extension(".sql")
Errors
- Throws ArgumentError if the extension doesn't start with a dot
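The leading-dot rule and the extension-based lookup can be illustrated with a minimal, self-contained sketch. It does not call RepoPacker; `SKETCH_TEXT_EXTENSIONS`, `add_extension_sketch`, and `is_text_file_sketch` are hypothetical names, and the package's actual storage and case handling may differ.

```julia
# Hypothetical stand-in for the package's internal list of text extensions.
const SKETCH_TEXT_EXTENSIONS = Set([".jl", ".md", ".toml"])

# Register an extension, rejecting anything without a leading dot.
function add_extension_sketch(ext::AbstractString)
    startswith(ext, ".") ||
        throw(ArgumentError("extension must start with a dot: $ext"))
    push!(SKETCH_TEXT_EXTENSIONS, String(ext))
    return nothing
end

# Extension-based check: splitext returns (stem, extension-with-dot).
is_text_file_sketch(path::AbstractString) =
    splitext(path)[2] in SKETCH_TEXT_EXTENSIONS

add_extension_sketch(".sql")
println(is_text_file_sketch("queries/report.sql"))
println(is_text_file_sketch("docs/logo.png"))
```

Keeping the dot in the stored value lets the result of `splitext` be compared directly, which is one plausible reason for requiring it in the argument.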
RepoPacker.calculate_file_metrics — Method
calculate_file_metrics(text_files::Vector{String}, base_dir::AbstractString)
Calculate metrics for a collection of text files.
Returns
- Tuple containing:
  - total_chars: Total character count
  - total_tokens: Total token estimate
  - file_char_counts: Dict mapping file paths to character counts
  - file_token_counts: Dict mapping file paths to token estimates
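The shape of the returned tuple can be sketched as follows. This is an illustration under assumptions, not the package's implementation: `file_metrics_sketch` is a hypothetical name, the dictionaries are keyed by path relative to `base_dir` (the source only says "file paths"), and tokens use the length ÷ 4 heuristic documented for estimate_token_count.

```julia
# Per-file and total character/token metrics for a list of files.
function file_metrics_sketch(text_files::Vector{String}, base_dir::AbstractString)
    file_char_counts = Dict{String, Int}()
    file_token_counts = Dict{String, Int}()
    for path in text_files
        content = read(path, String)
        rel = relpath(path, base_dir)            # assumption: keys are relative paths
        file_char_counts[rel] = length(content)  # characters, not bytes
        file_token_counts[rel] = length(content) ÷ 4
    end
    total_chars = sum(values(file_char_counts); init = 0)
    total_tokens = sum(values(file_token_counts); init = 0)
    return total_chars, total_tokens, file_char_counts, file_token_counts
end

# Demo on two small files in a temporary directory.
metrics_dir = mktempdir()
write(joinpath(metrics_dir, "a.txt"), "hello world!")  # 12 characters
write(joinpath(metrics_dir, "b.txt"), "abcdefgh")      # 8 characters
metric_files = [joinpath(metrics_dir, "a.txt"), joinpath(metrics_dir, "b.txt")]
total_chars, total_tokens, char_counts, token_counts =
    file_metrics_sketch(metric_files, metrics_dir)
println((total_chars, total_tokens))
```

Because each file is rounded down separately, the total token estimate can be slightly lower than applying the heuristic to the combined character count.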
RepoPacker.clone_and_pack — Function
clone_and_pack(repo_url::AbstractString, output_file::AbstractString="repo.xml";
output_style::Symbol=:xml, temp_dir::AbstractString=tempname(), verbose::Bool=false)
Clone a GitHub repository and pack its text files into a single output file in the specified format.
Arguments
repo_url: URL of the Git repository to clone
output_file: Output file path (default: "repo.xml")
output_style: Format to use (:xml, :json, or :markdown)
temp_dir: Temporary directory for cloning (default: auto-generated)
verbose: Whether to enable detailed logging (default: false)
Returns
- Path to the generated output file
Examples
RepoPacker.clone_and_pack("https://github.com/username/repo.git", "output.xml")
Errors
- Throws errors related to Git operations or file writing
RepoPacker.collect_text_files — Method
collect_text_files(dir_path::AbstractString; verbose::Bool=false)
Recursively collect all text files in a directory, skipping .git and neglected paths.
Arguments
dir_path: Directory path to scan
verbose: Whether to enable detailed logging (default: false)
Returns
- Vector of text file paths
Examples
files = RepoPacker.collect_text_files(".")
Errors
- Throws ArgumentError if the directory doesn't exist
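A recursive scan of this kind can be sketched with Base's `walkdir`. The code below is a simplified illustration, not the package's implementation: `collect_text_files_sketch` and `SKETCH_SCAN_EXTENSIONS` are hypothetical names, and handling of neglected paths is left out.

```julia
# Hypothetical extension list; the package maintains its own.
const SKETCH_SCAN_EXTENSIONS = Set([".jl", ".md"])

function collect_text_files_sketch(dir_path::AbstractString)
    isdir(dir_path) ||
        throw(ArgumentError("directory does not exist: $dir_path"))
    found = String[]
    for (root, _, files) in walkdir(dir_path)
        # Skip anything located under a .git directory.
        ".git" in splitpath(relpath(root, dir_path)) && continue
        for name in files
            if splitext(name)[2] in SKETCH_SCAN_EXTENSIONS
                push!(found, joinpath(root, name))
            end
        end
    end
    return sort!(found)
end

# Demo tree: one source file, one file inside .git, one binary-looking file.
scan_dir = mktempdir()
mkpath(joinpath(scan_dir, "src"))
mkpath(joinpath(scan_dir, ".git"))
write(joinpath(scan_dir, "src", "a.jl"), "x = 1")
write(joinpath(scan_dir, ".git", "notes.md"), "internal")
write(joinpath(scan_dir, "logo.png"), "not really an image")
scanned = collect_text_files_sketch(scan_dir)
println(scanned)
```

Only `src/a.jl` survives the scan here: the `.git` entry is skipped by location and `logo.png` by extension.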
RepoPacker.estimate_token_count — Method
estimate_token_count(content::String)
Estimate token count using the simple heuristic: length(content) ÷ 4. This approximates GPT-style tokenizers for English-like code/text.
Returns
- Estimated token count as Int
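The heuristic is small enough to restate directly. The sketch below uses a hypothetical name, `estimate_tokens_sketch`, and shows two consequences of the formula: integer division rounds down, and `length` counts characters rather than bytes.

```julia
# length(content) ÷ 4, using integer division (rounds toward zero).
estimate_tokens_sketch(content::AbstractString) = length(content) ÷ 4

println(estimate_tokens_sketch("abcdefgh"))  # 8 characters
println(estimate_tokens_sketch("abc"))       # fewer than 4 characters
println(estimate_tokens_sketch("αβγδ"))      # 4 characters, 8 bytes in UTF-8
```

Strings shorter than four characters therefore estimate to zero tokens, so very small files contribute nothing to a total built from this formula.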
RepoPacker.generate_json_content — Method
generate_json_content(text_files::Vector{String}, base_dir::AbstractString; verbose::Bool=false)
Generate repository content in Repomix-inspired JSON format.
Returns
- String containing valid JSON with keys: fileSummary, directoryStructure, files, metrics
RepoPacker.generate_markdown_content — Method
generate_markdown_content(text_files::Vector{String}, base_dir::AbstractString; verbose::Bool=false)
Generate repository content in Repomix-inspired Markdown format.
RepoPacker.generate_xml_content — Method
generate_xml_content(text_files::Vector{String}, base_dir::AbstractString; verbose::Bool=false)
Generate XML content in Repomix format.
Arguments
text_files: Vector of text file paths to include
base_dir: Base directory of the repository
verbose: Whether to enable detailed logging (default: false)
Returns
- String containing the XML content with header
Examples
files = RepoPacker.collect_text_files(".")
xml_content = RepoPacker.generate_xml_content(files, ".")
RepoPacker.get_directory_structure — Method
get_directory_structure(dir_path::AbstractString; verbose::Bool=false)
Generate a visual representation of the directory structure, excluding neglected paths.
Arguments
dir_path: Directory path to analyze
verbose: Whether to enable detailed logging (default: false)
Returns
- String representation of the directory structure
Examples
structure = RepoPacker.get_directory_structure(".")
println(structure)
RepoPacker.get_top_files — Function
get_top_files(file_token_counts::Dict{String, Int}, n::Int=5)
Get the top N files with the highest token counts.
Returns
- Vector of tuples (file_path, token_count) sorted by token count descending
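Ranking by token count amounts to sorting the dictionary's pairs by value and keeping the first N. The following is a minimal sketch with a hypothetical name, `top_files_sketch`; how the package orders files with equal counts is not stated in the source.

```julia
# Return up to n (file_path, token_count) tuples, largest count first.
function top_files_sketch(file_token_counts::Dict{String, Int}, n::Int = 5)
    ranked = sort(collect(file_token_counts); by = last, rev = true)
    keep = ranked[1:min(n, length(ranked))]
    return [(path, tokens) for (path, tokens) in keep]
end

sample_counts = Dict("a.jl" => 120, "b.jl" => 40, "c.jl" => 300)
top_two = top_files_sketch(sample_counts, 2)
println(top_two)
```

Clamping with `min(n, length(ranked))` keeps the call safe when fewer than N files are available.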
RepoPacker.is_text_file — Method
is_text_file(path::AbstractString)
Check if a file is a text file based on its extension.
Arguments
path: File path to check
Returns
true if the file has a recognized text extension, false otherwise
Examples
RepoPacker.is_text_file("src/RepoPacker.jl") # returns true
RepoPacker.is_text_file("docs/logo.png") # returns false
RepoPacker.neglect_path — Method
neglect_path(path::AbstractString)
Add a path (file or directory) to be excluded from packing. Paths are matched as substrings of the full file path (relative to the repo root).
Arguments
path: Path pattern to exclude (can be relative or absolute)
Examples
RepoPacker.neglect_path("test/")
RepoPacker.neglect_path(".env")
RepoPacker.pack_directory — Function
pack_directory(dir_path::AbstractString, output_file::AbstractString="repo.xml";
output_style::Symbol=:xml, verbose::Bool=false)
Pack the text files in a directory into a single output file in the specified format.
Arguments
dir_path: Directory path to pack
output_file: Output file path (default: "repo.xml")
output_style: Format to use (:xml, :json, or :markdown)
verbose: Whether to enable detailed logging (default: false)
Returns
- Path to the generated output file
Examples
RepoPacker.pack_directory(".", "repo.xml")
Errors
- Throws ArgumentError if the directory doesn't exist
RepoPacker.should_neglect — Method
should_neglect(full_path::AbstractString, base_dir::AbstractString)
Check if a file should be excluded based on the global NEGLECT_PATHS list. Compares against both absolute and relative (to base_dir) paths.
Arguments
full_path: Absolute path of the file
base_dir: Base directory of the repository
Returns
true if the file should be excluded, false otherwise
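Substring matching against both the absolute and the relative path can be sketched as below. This is an illustration under assumptions, not the package's code: `NEGLECTED_SKETCH`, `neglect_path_sketch`, and `should_neglect_sketch` are hypothetical names, and the example paths assume Unix-style separators.

```julia
# Hypothetical stand-in for the package's global list of neglected patterns.
const NEGLECTED_SKETCH = String[]

neglect_path_sketch(path::AbstractString) = push!(NEGLECTED_SKETCH, String(path))

# A file is excluded if any pattern occurs in its absolute or relative path.
function should_neglect_sketch(full_path::AbstractString, base_dir::AbstractString)
    rel = relpath(full_path, base_dir)
    return any(p -> occursin(p, full_path) || occursin(p, rel), NEGLECTED_SKETCH)
end

neglect_path_sketch("test/")
println(should_neglect_sketch("/repo/test/runtests.jl", "/repo"))
println(should_neglect_sketch("/repo/src/main.jl", "/repo"))
println(should_neglect_sketch("/repo/latest/notes.md", "/repo"))
```

Under pure substring matching, the pattern "test/" also matches "latest/notes.md", so short patterns can exclude more than intended; a more specific pattern such as "/test/" narrows the match when applied to absolute paths.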