Data Models

This section documents the key data models used throughout the RepoPacker.jl package.

RepoPacker.add_extension - Method
add_extension(ext::AbstractString)

Add a file extension (e.g., ".r", ".sql") to the list of recognized text file extensions. The extension should include the leading dot.

Arguments

  • ext: File extension to add (must start with a dot)

Examples

RepoPacker.add_extension(".r")
RepoPacker.add_extension(".sql")

Errors

  • Throws ArgumentError if extension doesn't start with a dot
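
The dot check described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the registry `SKETCH_TEXT_EXTENSIONS` and the function `sketch_add_extension` are hypothetical names, and the real package may store its extensions differently.

```julia
# Hypothetical registry of recognized extensions (an assumption for this sketch).
const SKETCH_TEXT_EXTENSIONS = Set([".jl", ".md", ".txt"])

function sketch_add_extension(ext::AbstractString)
    # Reject extensions that lack the leading dot, as the docstring describes.
    startswith(ext, ".") ||
        throw(ArgumentError("extension must start with a dot: $ext"))
    push!(SKETCH_TEXT_EXTENSIONS, String(ext))
    return nothing
end

sketch_add_extension(".sql")   # accepted
# sketch_add_extension("sql")  # would throw ArgumentError
```

Validating at registration time keeps malformed entries out of the list, so later lookups need no extra checks.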

RepoPacker.calculate_file_metrics - Method
calculate_file_metrics(text_files::Vector{String}, base_dir::AbstractString)

Calculate metrics for a collection of text files.

Returns

  • Tuple containing:
    • total_chars: Total character count
    • total_tokens: Total token estimate
    • file_char_counts: Dict mapping file paths to character counts
    • file_token_counts: Dict mapping file paths to token estimates
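
A rough sketch of how such a four-element tuple could be assembled is shown below. The function `sketch_file_metrics` is hypothetical, the choice of relative paths as dictionary keys is an assumption, and the token estimate reuses the `length(content) ÷ 4` heuristic documented for `estimate_token_count`.

```julia
# Illustrative stand-in for the metric pass; not the package's own code.
function sketch_file_metrics(text_files::Vector{String}, base_dir::AbstractString)
    file_char_counts = Dict{String, Int}()
    file_token_counts = Dict{String, Int}()
    for path in text_files
        content = read(path, String)
        rel = relpath(path, base_dir)      # assumption: keys are relative paths
        file_char_counts[rel] = length(content)
        file_token_counts[rel] = length(content) ÷ 4
    end
    total_chars = sum(values(file_char_counts); init=0)
    total_tokens = sum(values(file_token_counts); init=0)
    return total_chars, total_tokens, file_char_counts, file_token_counts
end

# Usage on a throwaway directory containing one 8-character file.
dir = mktempdir()
path = joinpath(dir, "a.txt")
write(path, "abcdefgh")
total_chars, total_tokens, chars, tokens = sketch_file_metrics([path], dir)
```

Destructuring the result, as in the last line, is the usual way to consume a tuple return value in Julia.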

RepoPacker.clone_and_pack - Function
clone_and_pack(repo_url::AbstractString, output_file::AbstractString="repo.xml"; 
               output_style::Symbol=:xml, temp_dir::AbstractString=tempname(), verbose::Bool=false)

Clone a Git repository (for example, one hosted on GitHub) and pack its text files into a single output file in the specified format.

Arguments

  • repo_url: URL of the Git repository to clone
  • output_file: Output file path (default: "repo.xml")
  • output_style: Format to use (:xml, :json, or :markdown; default: :xml)
  • temp_dir: Temporary directory for cloning (default: auto-generated)
  • verbose: Whether to enable detailed logging (default: false)

Returns

  • Path to the generated output file

Examples

RepoPacker.clone_and_pack("https://github.com/username/repo.git", "output.xml")

Errors

  • Throws errors related to Git operations or file writing

RepoPacker.collect_text_files - Method
collect_text_files(dir_path::AbstractString; verbose::Bool=false)

Recursively collect all text files in a directory, skipping .git and neglected paths.

Arguments

  • dir_path: Directory path to scan
  • verbose: Whether to enable detailed logging (default: false)

Returns

  • Vector of text file paths

Examples

files = RepoPacker.collect_text_files(".")

Errors

  • Throws ArgumentError if directory doesn't exist

RepoPacker.estimate_token_count - Method
estimate_token_count(content::String)

Estimate token count using the simple heuristic: length(content) ÷ 4. This approximates GPT-style tokenizers for English-like code/text.

Returns

  • Estimated token count as Int
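
The heuristic stated above is small enough to write out directly. The name `sketch_estimate_token_count` is hypothetical; only the formula `length(content) ÷ 4` comes from the docstring.

```julia
# Integer division by 4, as described in the docstring.
# `length` counts characters, not bytes, so multi-byte Unicode is counted once.
sketch_estimate_token_count(content::String) = length(content) ÷ 4

estimate = sketch_estimate_token_count("function f(x) end")  # 17 characters
```

Because `÷` truncates, strings shorter than four characters are estimated at zero tokens.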

RepoPacker.generate_json_content - Method
generate_json_content(text_files::Vector{String}, base_dir::AbstractString; verbose::Bool=false)

Generate repository content in Repomix-inspired JSON format.

Returns

  • String containing valid JSON with keys: fileSummary, directoryStructure, files, metrics

RepoPacker.generate_markdown_content - Method
generate_markdown_content(text_files::Vector{String}, base_dir::AbstractString; verbose::Bool=false)

Generate repository content in Repomix-inspired Markdown format.

RepoPacker.generate_xml_content - Method
generate_xml_content(text_files::Vector{String}, base_dir::AbstractString; verbose::Bool=false)

Generate XML content in Repomix format.

Arguments

  • text_files: Vector of text file paths to include
  • base_dir: Base directory of the repository
  • verbose: Whether to enable detailed logging (default: false)

Returns

  • String containing the XML content with header

Examples

files = RepoPacker.collect_text_files(".")
xml_content = RepoPacker.generate_xml_content(files, ".")

RepoPacker.get_directory_structure - Method
get_directory_structure(dir_path::AbstractString; verbose::Bool=false)

Generate a visual representation of the directory structure, excluding neglected paths.

Arguments

  • dir_path: Directory path to analyze
  • verbose: Whether to enable detailed logging (default: false)

Returns

  • String representation of the directory structure

Examples

structure = RepoPacker.get_directory_structure(".")
println(structure)

RepoPacker.get_top_files - Function
get_top_files(file_token_counts::Dict{String, Int}, n::Int=5)

Get the top N files with the highest token counts.

Returns

  • Vector of tuples (file_path, token_count) sorted by token count descending
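
One plausible way to produce such a ranking is to sort the dictionary's pairs by value. The function `sketch_top_files` and the sample counts below are hypothetical and only illustrate the documented return shape.

```julia
# Illustrative ranking; the package may implement this differently.
function sketch_top_files(file_token_counts::Dict{String, Int}, n::Int=5)
    # collect turns the Dict into a Vector of Pairs; `last` picks the count.
    ranked = sort(collect(file_token_counts); by=last, rev=true)
    # first(ranked, n) returns at most n elements, even if fewer exist.
    return [(path, count) for (path, count) in first(ranked, n)]
end

counts = Dict("src/a.jl" => 120, "README.md" => 40, "test/t.jl" => 300)
top = sketch_top_files(counts, 2)
```

With these sample counts, `top` holds the entries for "test/t.jl" and "src/a.jl", in that order.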

RepoPacker.is_text_file - Method
is_text_file(path::AbstractString)

Check if a file is a text file based on its extension.

Arguments

  • path: File path to check

Returns

  • true if the file has a recognized text extension, false otherwise

Examples

RepoPacker.is_text_file("src/RepoPacker.jl")  # returns true
RepoPacker.is_text_file("docs/logo.png")      # returns false
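
An extension-based check like this can be sketched with `splitext` from Julia's standard library. The set `SKETCH_KNOWN_EXTENSIONS` and the function `sketch_is_text_file` are hypothetical; the package's actual extension list is not reproduced here.

```julia
# Hypothetical extension list for illustration only.
const SKETCH_KNOWN_EXTENSIONS = Set([".jl", ".md", ".toml"])

# splitext returns (root, extension); the extension includes the leading dot.
sketch_is_text_file(path::AbstractString) =
    splitext(path)[2] in SKETCH_KNOWN_EXTENSIONS

is_source = sketch_is_text_file("src/RepoPacker.jl")
is_image = sketch_is_text_file("docs/logo.png")
```

The check never opens the file, so it is fast but will misclassify text files with unrecognized extensions.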

RepoPacker.neglect_path - Method
neglect_path(path::AbstractString)

Add a path (file or directory) to be excluded from packing. Paths are matched as substrings in the full file path (relative to repo root).

Arguments

  • path: Path pattern to exclude (can be relative or absolute)

Examples

RepoPacker.neglect_path("test/")
RepoPacker.neglect_path(".env")

RepoPacker.pack_directory - Function
pack_directory(dir_path::AbstractString, output_file::AbstractString="repo.xml"; 
               output_style::Symbol=:xml, verbose::Bool=false)

Pack the text files in a directory into a file in the specified format.

Arguments

  • dir_path: Directory path to pack
  • output_file: Output file path (default: "repo.xml")
  • output_style: Format to use (:xml, :json, or :markdown; default: :xml)
  • verbose: Whether to enable detailed logging (default: false)

Returns

  • Path to the generated output file

Examples

RepoPacker.pack_directory(".", "repo.xml")

Errors

  • Throws ArgumentError if directory doesn't exist

RepoPacker.should_neglect - Method
should_neglect(full_path::AbstractString, base_dir::AbstractString)

Check if a file should be excluded based on the global NEGLECT_PATHS list. Compares against both absolute and relative (to base_dir) paths.

Arguments

  • full_path: Absolute path of the file
  • base_dir: Base directory of the repository

Returns

  • true if the file should be excluded, false otherwise
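
The substring matching described for neglected paths might look like the sketch below. All `sketch_`-prefixed names and the `SKETCH_NEGLECT_PATHS` list are hypothetical stand-ins; the sketch assumes, as the docstrings suggest, that a pattern matches when it occurs anywhere in either the absolute path or the path relative to the base directory.

```julia
# Hypothetical global exclusion list (stand-in for the package's own list).
const SKETCH_NEGLECT_PATHS = String[]

sketch_neglect_path(path::AbstractString) =
    push!(SKETCH_NEGLECT_PATHS, String(path))

function sketch_should_neglect(full_path::AbstractString, base_dir::AbstractString)
    rel = relpath(full_path, base_dir)
    # A file is excluded if any pattern is a substring of either form of its path.
    return any(p -> occursin(p, full_path) || occursin(p, rel), SKETCH_NEGLECT_PATHS)
end

sketch_neglect_path("test/")
skipped = sketch_should_neglect("/repo/test/runtests.jl", "/repo")
kept = sketch_should_neglect("/repo/src/main.jl", "/repo")
```

Substring matching is permissive: a pattern such as "test/" would also exclude a directory named "latest/", so more specific patterns are safer.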