🗺️ Overview


🌱 ACORN¹ provides and operationalizes an ontology for research activity data (RAD). It adds linked data context and transforms RAD into a knowledge graph that is amenable to automated reasoning and artifact generation (e.g., PDF and PPTX files).

ACORN is a command-line multi-tool that uses automated processes to inform and enforce defined content schemas. With these content schemas, ACORN builds communication assets such as PDFs, presentation files, and web pages. It also lays the foundation for deep data insights about ORNL’s (and any institution’s) corpus of research. Built using the memory-safe Rust programming language, ACORN can be used on any Windows, Mac, or Linux machine.


ACORN Input/Output


  1. ACORN stands for “Accessible Content Optimization for Research Needs” ↩

🤔 So what? Big deal? Who cares?

Important

If you do science, you need ACORN.

🎯 What is ACORN trying to solve?

Accessible Content Optimization for Research Needs, ACORN, allows science achievers and communicators to create analysis-ready research activity data. ACORN’s associated schemas standardize how research is codified and communicated to capture the entire research architecture, including what research is being done, how it’s done, and how it all relates.

Built with memory-safe Rust, ACORN is a portable solution that uses statically typed and dynamically validated data structures. Schemas include unique identifiers¹ to support open science principles and practices.

ACORN can help inform researchers, sponsors, and partners, train machines on existing research projects, and identify where gaps in our research exist.

🆕 How is it novel?

ACORN focuses on research at the project level. While other systems track research projects tangentially (through people, publications, or organizations), we know of no system that employs projects as the primitive source of information. We believe codifying research activity data at the project level is the key to unlocking deep insights about an organization’s resources, funding, partnerships, and accomplishments. This network of linked data surpasses what may already be present but siloed in current search tools.

With existing tools, we can see:

  • publications², but we don’t know about projects without journal publications or how projects relate,
  • people³, but people move groups, change job roles, and leave organizations,
  • approvals and timelines, but these systems often aren’t designed to integrate with other persistent identifiers (PIDs) or systems, and
  • budgets, but budgets do not include the level of project detail needed to capture appropriate scientific understanding.

ACORN does not replace or remove these systems. Each supports part of the puzzle. ACORN can help ensure they integrate and work together to provide users with valuable information from a central source of truth, no matter their entry point.

⚠️ What are the risks?

ACORN carries few risks. The main ones are anything that could halt continued development, such as a lack of funding or staff, and the security risks inherent in any project hosted at a location that handles potentially sensitive data.

Even if ACORN completely fails⁴, our “escalator turns to stairs”: what already exists keeps working, so it never truly fails. It is open-source⁵, designed to be local and decentralized, and built on existing systems, with information stored in flat files so it will never be lost or unshareable.

Tip

Research activity data curated and controlled by ACORN and its associated schemas and processes will never be without value.

🎁 Why should I use ACORN?

You should use ACORN if you want your project to be part of the research conversation. We initially developed ACORN for researchers, to help capture and communicate their projects more effectively. ACORN helps cut down on administrative burden, allowing the PI or a project designee to submit descriptive metadata to a form and receive automatically generated fact sheets in PDF, web, or PowerPoint form. This form, kept as a JSON file in a designated repository⁶, becomes the single source of truth for information on each project. We continue to develop the metadata schema and talk with research groups to benefit organization processes.


  1. ACORN provides CURIEs and will soon also work with research activity identifiers (RAiD) ↩

  2. e.g., DOIs ↩

  3. e.g., ORCiDs ↩

  4. Realistically, the only mode of failure is complete lack of adoption… ↩

  5. https://www.osti.gov/doecode/biblio/156286 ↩

  6. We call these repositories “buckets” (see the page on buckets for details) ↩

🚀 Getting Started

Important

The most popular way to use ACORN is to not install it at all: simply run ACORN within a CI job (see this example .gitlab-ci.yml).

Installation

Install with Scoop (Windows only)

Warning

🚧 Under construction (Pull Request/Issue)

  • Ensure you have Scoop installed.

  • Open your terminal and add the extras bucket to Scoop

    scoop bucket add extras
    
  • Run the following command to install ACORN

    scoop install acorn
    
  • After installation, you can verify the installation by running:

    acorn help
    

Install with Homebrew (Linux and MacOS)

Warning

🚧 Under construction

Install with Cargo

  • Ensure you have Rust and Cargo installed. If not, you can install them using rustup.rs.

  • Open your terminal and run the following command to install ACORN

    cargo install acorn-cli
    
  • After installation, you can verify the installation by running:

    acorn help
    

Run Inside a Container

  • Ensure you have Docker¹ installed.

  • Pull the latest ACORN image with Docker¹

    docker pull savannah.ornl.gov/research-enablement/acorn/runner
    
  • Run ACORN using Docker¹

    docker run --rm savannah.ornl.gov/research-enablement/acorn/runner help
    

Tip

You will probably need to mount a volume to use ACORN with your local files. For example:

docker run --rm -v "$(pwd)":/data savannah.ornl.gov/research-enablement/acorn/runner:latest check /data/project

Download pre-compiled binary

  • Download the latest release from the ACORN GitHub Releases
    • 🪟 Windows

      • Open a PowerShell terminal and run:

        irm -OutFile acorn.exe -Uri https://code.ornl.gov/api/v4/projects/16689/packages/generic/x86_64-pc-windows-gnu/v0.1.50/acorn.exe
        
      • Test the downloaded executable:

        .\acorn.exe help
        
    • 🐧 Linux

      • Open a terminal and run:

        curl -LO https://code.ornl.gov/api/v4/projects/16689/packages/generic/x86_64-unknown-linux-musl/v0.1.45/acorn
        
      • Make the downloaded file executable:

        chmod +x acorn
        
      • Test the downloaded executable:

        ./acorn help
        
    • 🍎 MacOS

      🚧 Under construction

Tip

Move the binary to a directory included in your system’s PATH for system-wide access.

Install from source

  • Clone the ACORN repository

    git clone https://code.ornl.gov/research-enablement/acorn.git
    cd acorn
    
  • Install acorn command

    cargo install --path ./acorn-cli
    
  • After installation, you can verify the installation by running:

    acorn help
    

  1. These instructions will work with any OCI-compliant container runtime, such as Docker or Podman. ↩

Concepts

ACORN is built around a few core concepts that help you understand how to use it effectively. This section will introduce you to these concepts and explain their significance in the context of ACORN and science.

Motivation

For the last few decades, scientific progress has been driven by publishing papers in scientific journals, ideally peer-reviewed ones. Publishing affords researchers recognition, career advancement, and funding opportunities. This model has worked well for a long time, but it has some serious limitations that ACORN aims to address.

  1. Publish or perish¹: Scientists and researchers often care more about being published than about sharing scientific knowledge. This is not without reason, as publications are a key metric for career advancement and funding. However, this can lead to a focus on quantity over quality, and a reluctance to share negative results or data that does not support a hypothesis.
  2. Reproducibility crisis²: Many scientific results are difficult or impossible to reproduce, leading to questions about their validity. This is compounded by the fact that many publications do not provide access to the underlying data or code used in the research.
  3. No clear way to demonstrate science: Publications, people, and budgets do not tell the whole story. Science needs a cross-domain standard to collect and communicate the full context of scientific endeavors, including data, code, methods, and results.

ACORN addresses these issues by reducing the administrative burden of codifying the details of a research project and providing an automated framework for the sharing and analysis of scientific knowledge. Predicated on the idea of applying “science all the way down”, ACORN applies rigorous scientific principles to the management and dissemination of scientific knowledge itself.

Research Enablement

The Research Enablement Initiative (REI) at Oak Ridge National Laboratory (ORNL) aims to improve the way scientific research is conducted, shared, and evaluated. It is run by a team of developers, communication experts, and information scientists passionate about open science, transparency, and improving the researcher experience.

REI works toward its goals by providing:

  • A cross-domain model of research activity data (RAD)
    • ACORN - Schema, controlled vocabularies, and ontologies for describing research activities
    • ASPECT - Attribute of ACORN that describes technology associated with a given research activity
  • A command-line application written in Rust - acorn-cli
  • A Rust crate for working with RAD - acorn-lib
  • A Python package for working with RAD - acorn-py
  • A catalog of research activity data at ORNL - research.ornl.gov

🪣 Buckets

Research activity data is persisted in versioned folder collections called “buckets”. Each bucket contains a set of files and media assets that describe research activities. Buckets can be stored locally or in a remote repository, such as a GitLab or GitHub repository. For multiple examples, see ORNL’s buckets.

Buckets can be combined via a flat-file³ configuration file using the acorn CLI tool. This allows users to aggregate data from multiple sources and generate reports or other outputs based on the combined data.

Buckets are designed to be flexible and extensible. They do not require a cloud provider or expensive infrastructure, and can be stored in any version-controlled repository, such as GitLab or GitHub, or even on a local file system.

Buckets are designed to enable federation and scaling while maintaining low-level control over permissions and access. This allows organizations to share data across teams and departments while maintaining control over who can access and modify the data.

Tip

See the research enablement wiki for more information on buckets.


  1. D. R. Grimes, C. T. Bauch, and J. P. A. Ioannidis, “Modelling science trustworthiness under publish or perish pressure,” Royal Society Open Science, vol. 5, no. 1, p. 171511, Jan. 2018, doi: 10.1098/rsos.171511. ↩

  2. M. Baker, “1,500 scientists lift the lid on reproducibility,” Nature, vol. 533, no. 7604, Art. no. 7604, May 2016, doi: 10.1038/533452a. ↩

  3. A flat-file is a simple text file that contains data in a structured format, such as JSON or YAML. Flat-files are easy to read and write, and can be used to store configuration data for applications. ↩

ACORN Schemas and Ontologies

ACORN Schema Overview

ASPECT

A Scientific Prescription for the Efficient Classification of Technology

The ASPECT framework is a standardized methodology for classifying and describing technology components within the ACORN ecosystem. It provides a structured approach to defining the attributes and relationships of various technological elements, ensuring consistency and interoperability across different systems and applications.

ASPECT was designed with the goal of unifying our understanding of automation, AI/ML technology, and “classical” software. We focus on “technology” instead of “AI/ML technology” because the latter is a subset of the former. Furthermore, focusing on AI/ML as the end goal is not fruitful or correct. In fact, doing so is backwards. AI/ML software is not novel in any meaningful sense. Even if it were, it would still be 100% predicated on the scientific principles of software.

In the context of technology, AI/ML and automation are the same.

Tip

Read docs.rs documentation for ASPECT

Key Components

💼 Portability
  • Limited
  • Source
  • Containerized
  • Installer
  • Automated Installer
  • WebAssembly
🤝 Autonomy
  • Manual
  • Machine-assisted
  • Human-as-primary
  • Machine-as-primary
  • Human-supervised
  • Machine-only
📈 Maturity

Maturity uses an augmented version of technology readiness levels (TRL) and includes levels 1 through 9

🦾 Motivity
  • Type 0
  • Type 1A
  • Type 1B
  • Type 2
💾 Data
  • Real or Synthetic
  • Availability
  • Modality
  • Quality
🖥️ (Hardware) Resources
  • CPU
  • GPU
  • TPU
  • FPGA
  • Quantum
  • Neuromorphic
  • Other
🎯 Task Classification
  • Perceive
  • Reason
  • Project

Real-world Example

The ASPECT framework can be applied in various scenarios

  • AI-driven Gravity Mapping
    • 💼 Source (Level 1)
    • 🤝 Machine-assisted (Level 1)
    • 📈 Developed (TRL 5)
    • 🦾 Type 2
    • 💾 Trained on Real, Unavailable, Silver quality, Textual modality data
    • 🖥️ GPU
    • 🎯 Perceive, Reason
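
The classification above can be held in a small record so that it can be compared or serialized. The following Python sketch is illustrative only: the class name and field names are assumptions made for this example and are not the acorn-lib or acorn-py API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AspectProfile:
    """Hypothetical container for one ASPECT classification."""

    portability: str
    autonomy: str
    maturity: str
    motivity: str
    data: tuple[str, ...]
    resources: tuple[str, ...]
    tasks: tuple[str, ...]


# Values copied from the "AI-driven Gravity Mapping" example above
gravity_mapping = AspectProfile(
    portability="Source",
    autonomy="Machine-assisted",
    maturity="TRL 5",
    motivity="Type 2",
    data=("Real", "Unavailable", "Silver quality", "Textual modality"),
    resources=("GPU",),
    tasks=("Perceive", "Reason"),
)

print(gravity_mapping)
```

A frozen dataclass is used here so that a classification cannot be changed by accident after it is recorded.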

💼 Software Portability

Software portability refers to the ease with which software applications or components can be transferred and adapted to operate in different computing environments, platforms, or systems with minimal modification. This ASPECT attribute is crucial for ensuring that software can function effectively across diverse hardware configurations, operating systems, and cloud environments.

🤝 Autonomy

In the context of ASPECT, “autonomy” characterizes a technology’s level of human-machine teaming and describes the adaptive bi-directional team interaction among humans and machines that augments human capabilities for improved outcomes. This attribute builds on prior work by the Society of Automotive Engineers’ six levels of driving automation and expands on ISO definitions to partition technology into distinct and employable categories.

Tip

ISO/IEC 22989¹ defines human-machine teaming as “integration of human interaction with machine intelligence capabilities.”

Levels

Manual (HMT 0)

The execution of a simple script does not imply autonomy beyond manual operation. This level is characterized by the execution of a script where the deterministic outcome is fully known and controlled by the human operator.

[Diagram: Human, Task]

Machine-assisted (HMT 1)

In this level, the machine provides assistance to the human operator in executing a task. The human remains in full control of the task execution, with the machine offering support or suggestions as needed. This might include an iterative script that augments the input during each iteration based on prior outputs.

[Diagram: Machine, Human, Task]

Human as primary (HMT 2)

[Diagram: Machine, Human, Task]

Machine as primary (HMT 3)

[Diagram: Human, Machine, Task]

Human supervisor (HMT 4)

[Diagram: Human, Machine, Task]

Machine only (HMT 5)

[Diagram: Machine, Task]

  1. ISO/IEC 22989:2022 Information Technology: Artificial Intelligence Concepts and Terminology ↩

📈 Maturity

Levels

  • “Greenfield” research
  • Basic research (“Goal-oriented research”, TRL 1)
  • Technology concept (“Proof of principle”, TRL 2)
  • Feasible (“Systems development”, TRL 3)
  • Developing (“Proof of concept”, TRL 4)
  • Prototype (“Application development”, TRL 5)
  • Operational (“Integrations”, TRL 6)
  • Mission ready (TRL 7)
  • Mission capable (“Deployment”, TRL 8)

Tip

Read docs.rs documentation for maturity levels

Discussion

ASPECT uses an augmented version of technology readiness levels (TRLs) to indicate maturity. Wikipedia defines TRLs as “a method for estimating the maturity of technologies during the acquisition phase of a program. TRLs enable consistent and uniform discussions of technical maturity across different types of technology.” TRLs are widely used across the U.S. federal acquisitions community to assess the maturity of a particular technology. ASPECT also incorporates work¹ that adapted TRLs to address the particular nuances of machine learning systems. ASPECT maturity levels are compatible with the capability maturity model, but go beyond maintenance processes and efficiency to characterize the maturity of the technology in practice. In particular, the ASPECT openness and portability attributes contribute to a well-defined understanding of a given technology.

Ultimately, the maturity attribute of ASPECT combines with the other ASPECT attributes to provide a holistic view of the system under consideration, beyond simply how long the technology has been around or how well maintained it is. Maturity in ASPECT reflects the degree to which a system has been tested, validated, and proven in real-world scenarios, as well as its readiness for deployment and integration into existing workflows.


  1. A. Lavin et al., “Technology readiness levels for machine learning systems,” Nat Commun, vol. 13, no. 1, p. 6039, Oct. 2022, doi: 10.1038/s41467-022-33128-9. ↩

🦾 Motivity

Discussion

“Motivity” describes a technology’s ability to exert power over its environment. In the context of ASPECT, this involves how a given technology interacts with its environment and the degree of autonomy it possesses in performing tasks. As a model of interaction, motivity is built on task classification categories and is closely related to the human-machine teaming level. Motivity can also be viewed in terms of data binding: no binding, one-way binding, and two-way binding.

Motivity is a somewhat uncommon word. It was chosen for the ASPECT framework in part because deliberate ambiguity can foster conceptual depth. Rarity minimizes external preconceptions, enabling custom layering of meanings without the baggage of a widely used term.

Motivity has been defined in various contexts across philosophy, biology, and psychology, often emphasizing an intrinsic capacity for motion or change rather than an external force, which is consistent with its etymology. It captures an inherent “motive power”, a self-initiating force for change. This aligns with a data model’s bidirectional synchronization as an active, propulsive property rather than passive reactivity (which implies response) or linkage (which implies structural connection).

Similar niche terms, like “affordance” in HCI, gained traction despite initial obscurity and today offer rich, nuanced meanings.

Types

Motivity is logically partitioned into four distinct types based on the level of interaction with the environment. Motivity is built around the same tasks in task classification - perception, projection, and comprehension, which essentially act as proxies for one-way data binding (input), one-way data binding (output), and computation, respectively.

Type 0

Note

Example: Digital model

Type 0 has no interaction with the environment. Type 0 agency can involve “comprehension” (e.g., simulations) but also includes technology that does not (e.g., a hammer). Although comprehension is optional in every type, it is rarely an interesting distinction for Type 0 technologies.

[Diagram: Env and Self are not connected; Comprehension loops from Self back to Self]

Type 1A

Note

Example: Data shadow¹

Type 1A is arguably the most common form of technology, at least in the context of research. For example, using an interactive notebook (e.g., Jupyter) to visualize the distribution of data read from a CSV file is Type 1A, because the technology reads input from the environment (a CSV file) and presents an aspect of the data visually. Strictly speaking, the resulting image is a created artifact stored as zeroes and ones, but for classification purposes such ephemeral artifacts are not sufficient to justify Type 1B or Type 2.

[Diagram: Env → Perception → Self; Comprehension loops from Self back to Self]

Type 1B

Note

Example: Cron job that emails the status of a long-running calculation

[Diagram: Comprehension loops from Self back to Self; Self → Projection → Env]

Type 2

Note

Example: Digital twin¹

[Diagram: Env → Perception → Self; Comprehension loops from Self back to Self; Self → Projection → Env]
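
The four types reduce to two yes/no questions: does the technology read from its environment (perception), and does it write to its environment (projection)? The following Python sketch shows that mapping. The function and parameter names are assumptions made for illustration and are not part of ACORN.

```python
def motivity_type(perceives: bool, projects: bool) -> str:
    """Map environment interaction to a motivity type.

    perceives: the technology reads input from its environment
    projects:  the technology writes output to its environment
    Comprehension is optional in every type, so it is not a parameter.
    """
    if perceives and projects:
        return "Type 2"
    if perceives:
        return "Type 1A"
    if projects:
        return "Type 1B"
    return "Type 0"


# Examples taken from the notes in this section
examples = {
    "digital model": motivity_type(perceives=False, projects=False),
    "data shadow": motivity_type(perceives=True, projects=False),
    "cron job that emails status": motivity_type(perceives=False, projects=True),
    "digital twin": motivity_type(perceives=True, projects=True),
}
for name, kind in examples.items():
    print(f"{name}: {kind}")
```

Borderline cases, such as the ephemeral image produced by a notebook, still need the human judgment described under Type 1A; the function only encodes the final decision.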

  1. Y. K. Liu, S. K. Ong, and A. Y. C. Nee, “State-of-the-art survey on digital twin implementations,” Adv. Manuf., vol. 10, no. 1, pp. 1-23, Mar. 2022, doi: 10.1007/s40436-021-00375-w. ↩

💾 Data

Warning

This documentation is a work in progress. Some sections may be incomplete or subject to change.

🖥️ Resources

Warning

This documentation is a work in progress. Some sections may be incomplete or subject to change.

🎯 Task Types

Warning

This documentation is a work in progress. Some sections may be incomplete or subject to change.

Citations

A selection of academic papers and resources that have influenced the development of ACORN

[1] E. Njor, M. A. Hasanpour, J. Madsen, and X. Fafoutis, “A Holistic Review of the TinyML Stack for Predictive Maintenance,” IEEE Access, vol. 12, pp. 184861-184882, 2024, doi: 10.1109/ACCESS.2024.3512860.

[2] Y. Yang et al., “A Survey of AI Agent Protocols,” Apr. 26, 2025, arXiv: arXiv:2504.16736. doi: 10.48550/arXiv.2504.16736.

[3] B. Liu et al., “Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems,” Mar. 31, 2025, arXiv: arXiv:2504.01990. doi: 10.48550/arXiv.2504.01990.

[4] “AI Blindspot: A Discovery Process for preventing, detecting, and mitigating bias in AI systems.” Accessed: Jan. 24, 2023. [Online]. Available: https://aiblindspot.media.mit.edu/

[5] V. Gadepally et al., “AI Enabling Technologies: A Survey,” May 08, 2019, arXiv: arXiv:1905.03592. doi: 10.48550/arXiv.1905.03592.

[6] A. Jain, S. Sharma, and S. Duggal, “Comparative Study of Various Process Model in Software Development,” 2013. Accessed: Jan. 24, 2023. [Online]. Available: semanticscholar.org

[7] Q. Hua et al., “Context Engineering 2.0: The Context of Context Engineering,” Oct. 30, 2025, arXiv: arXiv:2510.26493. doi: 10.48550/arXiv.2510.26493.

[8] N. D. Lawrence, “Data Readiness Levels,” May 05, 2017, arXiv: arXiv:1705.02245. doi: 10.48550/arXiv.1705.02245.

[9] A. Fuller, Z. Fan, C. Day, and C. Barlow, “Digital Twin: Enabling Technologies, Challenges and Open Research,” IEEE Access, vol. 8, pp. 108952-108971, 2020, doi: 10.1109/ACCESS.2020.2998358.

[10] J. Gou, B. Yu, S. J. Maybank, and D. Tao, “Knowledge Distillation: A Survey,” Int J Comput Vis, vol. 129, no. 6, pp. 1789-1819, June 2021, doi: 10.1007/s11263-021-01453-z.

[11] D. Kreuzberger, N. Kühl, and S. Hirschl, “Machine Learning Operations (MLOps): Overview, Definition, and Architecture,” May 14, 2022, arXiv: arXiv:2205.02302. doi: 10.48550/arXiv.2205.02302.

[12] M. Mitchell et al., “Model Cards for Model Reporting,” in Proceedings of the Conference on Fairness, Accountability, and Transparency, Jan. 2019, pp. 220-229. doi: 10.1145/3287560.3287596.

[13] E. Blasch, J. Sung, and T. Nguyen, “Multisource AI Scorecard Table for System Evaluation,” Feb. 07, 2021, arXiv: arXiv:2102.03985. doi: 10.48550/arXiv.2102.03985.

[14] F. Yu, H. Zhang, and B. Wang, “Natural Language Reasoning, A Survey,” Mar. 26, 2023, arXiv: arXiv:2303.14725. doi: 10.48550/arXiv.2303.14725.

[15] S. Zhao, Y. Yang, Z. Wang, Z. He, L. K. Qiu, and L. Qiu, “Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely,” Sept. 23, 2024, arXiv: arXiv:2409.14924. Accessed: Oct. 02, 2024. [Online]. Available: arxiv.org

[16] Y. K. Liu, S. K. Ong, and A. Y. C. Nee, “State-of-the-art survey on digital twin implementations,” Adv. Manuf., vol. 10, no. 1, pp. 1-23, Mar. 2022, doi: 10.1007/s40436-021-00375-w.

[17] Center for Security and Emerging Technology and B. Buchanan, “The AI Triad and What It Means for National Security Strategy,” Center for Security and Emerging Technology, Aug. 2020. doi: 10.51593/20200021.

[18] J. M. Bradshaw, R. R. Hoffman, D. D. Woods, and M. Johnson, “The Seven Deadly Myths of ‘Autonomous Systems,’” IEEE Intelligent Systems, vol. 28, no. 3, pp. 54-61, May 2013, doi: 10.1109/MIS.2013.70.

[19] M. R. Endsley, “Toward a Theory of Situation Awareness in Dynamic Systems,” Human Factors, vol. 37, no. 1, pp. 32-64, Mar. 1995, doi: 10.1518/001872095779049543.

Command Line Reference


   β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ     β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ     β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ    β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ   β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ   β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ
  β–ˆβ–ˆβ–ˆβ–’β–’β–’β–’β–’β–ˆβ–ˆβ–ˆ   β–ˆβ–ˆβ–ˆβ–’β–’β–’β–’β–’β–ˆβ–ˆβ–ˆ  β–ˆβ–ˆβ–ˆβ–’β–’β–’β–’β–’β–ˆβ–ˆβ–ˆ β–’β–’β–ˆβ–ˆβ–ˆβ–’β–’β–’β–’β–’β–ˆβ–ˆβ–ˆ β–’β–’β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ β–’β–’β–ˆβ–ˆβ–ˆ
 β–’β–ˆβ–ˆβ–ˆ    β–’β–ˆβ–ˆβ–ˆ  β–ˆβ–ˆβ–ˆ     β–’β–’β–’  β–ˆβ–ˆβ–ˆ     β–’β–’β–ˆβ–ˆβ–ˆ β–’β–ˆβ–ˆβ–ˆ    β–’β–ˆβ–ˆβ–ˆ  β–’β–ˆβ–ˆβ–ˆβ–’β–ˆβ–ˆβ–ˆ β–’β–ˆβ–ˆβ–ˆ
 β–’β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ β–’β–ˆβ–ˆβ–ˆ         β–’β–ˆβ–ˆβ–ˆ      β–’β–ˆβ–ˆβ–ˆ β–’β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ   β–’β–ˆβ–ˆβ–ˆβ–’β–’β–ˆβ–ˆβ–ˆβ–’β–ˆβ–ˆβ–ˆ
 β–’β–ˆβ–ˆβ–ˆβ–’β–’β–’β–’β–’β–ˆβ–ˆβ–ˆ β–’β–ˆβ–ˆβ–ˆ         β–’β–ˆβ–ˆβ–ˆ      β–’β–ˆβ–ˆβ–ˆ β–’β–ˆβ–ˆβ–ˆβ–’β–’β–’β–’β–’β–ˆβ–ˆβ–ˆ  β–’β–ˆβ–ˆβ–ˆ β–’β–’β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ €
 β–’β–ˆβ–ˆβ–ˆ    β–’β–ˆβ–ˆβ–ˆ β–’β–’β–ˆβ–ˆβ–ˆ     β–ˆβ–ˆβ–ˆβ–’β–’β–ˆβ–ˆβ–ˆ     β–ˆβ–ˆβ–ˆ  β–’β–ˆβ–ˆβ–ˆ    β–’β–ˆβ–ˆβ–ˆ  β–’β–ˆβ–ˆβ–ˆ  β–’β–’β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ
 β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ   β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ β–’β–’β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ  β–’β–’β–’β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–’   β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ   β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ  β–’β–’β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ €
β–’β–’β–’β–’β–’   β–’β–’β–’β–’β–’   β–’β–’β–’β–’β–’β–’β–’β–’β–’     β–’β–’β–’β–’β–’β–’β–’    β–’β–’β–’β–’β–’   β–’β–’β–’β–’β–’ β–’β–’β–’β–’β–’    β–’β–’β–’β–’β–’
β €β €β €β €β €β €β €β €~ Accessible Content Optimization for Research Needs ~β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €

Usage: acorn [FLAGS] [COMMAND]

COMMANDS:
  check     Perform static analysis on research activity data and apply standardized best practices
  doctor    Diagnose and correct system requirements for using acorn
  download  Download research activity data from buckets
  export    Export research activity data to a specific target
  format    Formats research activity data in place (inherently includes some elements of `acorn check`)
  link      Add linked data context to research activity data
  schema    Print research activity data (RAD) or research activity identifier (RAiD) metadata JSON schema to stdout
  help      Print this message or the help of the given subcommand(s)

FLAGS:
  -X, --offline
          Prevent communication with the internet - intended for disconnected local environments

          Note: Use of --offline may require extra configuration options for certain commands

  -t, --threads <N>
          Limit number of threads used by rayon for parallel processing

          See Rayon documentation for more information

          [default: 0]

  -v, --verbose...
          Increase logging verbosity

  -q, --quiet...
          Decrease logging verbosity

  -V, --version
          Prints version information

  -h, --help
          Print help (see a summary with '-h')

Configuration

Flat-file

ACORN can be configured using JSON or YAML files. These configuration files let users specify options and settings for the ACORN CLI tool, including input and output directories, logging levels, and other parameters.

Example

.acorn.json file

{
    "buckets": [
        {
            "name": "bessd",
            "repository": {
                "provider": "gitlab",
                "id": 17603,
                "uri": "https://code.ornl.gov/research-enablement/buckets/bessd"
            }
        },
        {
            "name": "ccsd",
            "repository": {
                "provider": "gitlab",
                "id": 17602,
                "uri": "https://code.ornl.gov/research-enablement/buckets/ccsd"
            }
        },
        {
            "name": "nssd",
            "repository": {
                "provider": "gitlab",
                "id": 17410,
                "uri": "https://code.ornl.gov/research-enablement/buckets/nssd"
            }
        }
    ]
}

Tip

.acorn.json is the default configuration file name that ACORN looks for in the current working directory. You can also specify a different configuration file using the --config <FILE> flag when running ACORN commands.
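
Before handing a configuration file to ACORN, it can be useful to confirm that it parses and lists the buckets you expect. The following Python sketch is a pre-flight check a user might write. It is not ACORN's own validation, and the required keys (name, repository) are an assumption inferred from the example above.

```python
import json

# Abbreviated copy of the example configuration above
CONFIG_TEXT = """
{
    "buckets": [
        {"name": "bessd", "repository": {"provider": "gitlab", "id": 17603,
         "uri": "https://code.ornl.gov/research-enablement/buckets/bessd"}},
        {"name": "ccsd", "repository": {"provider": "gitlab", "id": 17602,
         "uri": "https://code.ornl.gov/research-enablement/buckets/ccsd"}}
    ]
}
"""


def bucket_names(config_text: str) -> list[str]:
    """Return bucket names, raising ValueError on an unexpected shape."""
    config = json.loads(config_text)
    if not isinstance(config, dict) or not isinstance(config.get("buckets"), list):
        raise ValueError('expected a top-level "buckets" list')
    names = []
    for bucket in config["buckets"]:
        # Assumed required keys, based only on the documented example
        if "name" not in bucket or "repository" not in bucket:
            raise ValueError(f"bucket is missing name or repository: {bucket!r}")
        names.append(bucket["name"])
    return names


print(bucket_names(CONFIG_TEXT))
```

In practice the text would come from reading .acorn.json (or the file passed to --config) rather than from an inline string.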

.env file and Environment Variables

The ACORN CLI tool can also be configured using .env files and/or environment variables.

Example

.env file

ACORN_LOG_LEVEL=info
READABILITY_METRIC=ari
MAX_ALLOWED_ARI=12
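
To make the file format concrete, the following Python sketch parses KEY=VALUE lines like the ones above and then applies overrides from a second mapping that stands in for the process environment. The helper names are hypothetical, and the precedence shown (environment over file) is an assumption for illustration; the source does not state how ACORN orders the two.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def apply_overrides(file_values: dict[str, str], environment: dict[str, str]) -> dict[str, str]:
    """Assumed precedence: a variable set in the environment wins over the file."""
    merged = dict(file_values)
    for key in file_values:
        if key in environment:
            merged[key] = environment[key]
    return merged


ENV_TEXT = """\
ACORN_LOG_LEVEL=info
READABILITY_METRIC=ari
MAX_ALLOWED_ARI=12
"""

file_values = parse_env(ENV_TEXT)
effective = apply_overrides(file_values, {"ACORN_LOG_LEVEL": "debug"})
print(effective)
```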

Commands

Command Workflow

stateDiagram
    direction LR
    [*] --> RAD: create
    RAD --> check
    check --> RAD
    check --> format
    format --> link
    format --> export
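
The diagram can be read as a table of allowed transitions. The following Python sketch encodes the same edges and checks whether a sequence of steps follows them. It is an illustration of the workflow only: the names are hypothetical, and ACORN itself is not stated to enforce this order.

```python
# Transitions copied from the state diagram above ("start" stands for [*])
TRANSITIONS = {
    "start": {"RAD"},
    "RAD": {"check"},
    "check": {"RAD", "format"},
    "format": {"link", "export"},
}


def is_valid_path(steps: list[str]) -> bool:
    """Return True if each step follows the previous one in the diagram."""
    current = "start"
    for step in steps:
        if step not in TRANSITIONS.get(current, set()):
            return False
        current = step
    return True


# Create data, check it, revise, check again, then format and export
print(is_valid_path(["RAD", "check", "RAD", "check", "format", "export"]))
# Formatting before checking is not an edge in the diagram
print(is_valid_path(["RAD", "format"]))
```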

🕵️‍♂️ Audit

Audit research activity data for completeness, consistency, and adherence to best practices - provide results in tabular format for easy review

Warning

This documentation is a work in progress. Some sections may be incomplete or subject to change.

✅ Check

Perform various checks on associated research activity data

Example Usage

# Check a specific research activity index
acorn check path/to/project/index.json

# Check all research activity data in a directory
acorn check path/to/project/

Categories of Checks

  • 🏛️ Schema Validation: Ensure that all data files conform to the expected schema
  • ✨ Prose Quality: Analyze written content for grammar, spelling, and more
  • 👓 Readability: Evaluate the readability of textual content using established metrics¹
  • 🔗 Link Integrity: Verify that all hyperlinks within the content are valid and reachable
  • 📊 Data Consistency: Check for consistency and completeness in datasets
  • 🚦 Convention Adherence: Ensure that naming conventions and organizational standards are followed
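
To give a sense of what a readability check measures, the following Python sketch computes the Automated Readability Index (ARI), one of the metrics ACORN can be configured to use. The formula is the standard published one. The word and sentence splitting below is a simplifying assumption, so scores from ACORN's readability module may differ.

```python
import re


def automated_readability_index(text: str) -> float:
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43"""
    # Simplified tokenization: words are runs of letters or digits,
    # optionally joined by an apostrophe or hyphen
    words = re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    # Count only letters and digits as characters
    characters = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)
    return 4.71 * characters / len(words) + 0.5 * len(words) / len(sentences) - 21.43


sample = "ACORN checks research activity data. It reports readability problems early."
print(round(automated_readability_index(sample), 2))
```

A higher score corresponds to text that needs a higher grade level to read, which is why the check is configured with a maximum allowed value.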

Customization Options

The check command supports several flags and options to customize its behavior (e.g., skipping certain checks, disabling certain behaviors, etc.)

Include --exit-on-first-error to stop execution upon encountering the first error.

Bypass checksum verification of downloaded artifacts with --skip-verify-checksum.²

Skip checks

  • --skip schema : Skip schema validation checks
  • --skip prose : Skip prose quality checks
  • --skip readability : Skip readability checks
  • --skip schema,prose : Skip both schema validation and prose quality checks (works for any combination of categories)
  • --disable-website-checks : Disable all website-related checks (link integrity, etc.)
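
When acorn check is driven from a script or CI job, the flags above can be assembled programmatically. The following Python sketch builds the argument list. The function name is hypothetical, only flags documented in this section are used, and placing the flags after the path is an assumption about argument order.

```python
def build_check_command(
    path: str,
    skip: tuple[str, ...] = (),
    exit_on_first_error: bool = False,
    disable_website_checks: bool = False,
) -> list[str]:
    """Assemble an argument list for `acorn check` using the flags above."""
    command = ["acorn", "check", path]
    if skip:
        # Categories are joined with commas, e.g. --skip schema,prose
        command += ["--skip", ",".join(skip)]
    if exit_on_first_error:
        command.append("--exit-on-first-error")
    if disable_website_checks:
        command.append("--disable-website-checks")
    return command


print(build_check_command("path/to/project/", skip=("schema", "prose")))
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where acorn is installed.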

Note

--disable-website-checks is redundant when acorn --offline is used, since --offline already prevents communication with the internet.

Configure Readability

Readability can be configured by passing options directly to the command line or via a .env file. Command line options override .env settings.

  • --readability-metric <METRIC> : Specify which readability metric to use
  • Set READABILITY_METRIC in your .env file to choose the readability metric. The default metric is fkgl (Flesch-Kincaid Grade Level).
  • Set MAX_ALLOWED_FKGL in your .env file to define the maximum acceptable FKGL score. Each metric has its own corresponding maximum score variable (e.g., MAX_ALLOWED_ARI for Automated Readability Index).
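The precedence rule (command line first, then .env, then the default) can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, and the .env values are assumed to have already been loaded into a mapping such as `os.environ`.

```python
import os

DEFAULT_METRIC = "fkgl"  # documented default (Flesch-Kincaid Grade Level)


def resolve_metric(cli_value=None, env=None):
    """Pick the readability metric: command line first, then environment, then default."""
    env = os.environ if env is None else env
    return (cli_value or env.get("READABILITY_METRIC") or DEFAULT_METRIC).lower()


def max_allowed_variable(metric):
    """Name of the variable holding the maximum score for a metric, e.g. MAX_ALLOWED_ARI."""
    return "MAX_ALLOWED_" + metric.upper()


env = {"READABILITY_METRIC": "cli"}
print(resolve_metric("ari", env), max_allowed_variable("ari"))
print(resolve_metric(None, env), max_allowed_variable("cli"))
```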

Example .env file

Configure ACORN to use the Coleman-Liau Index (CLI) readability metric with a maximum allowed score of 14.0 (default value is 12.0):

READABILITY_METRIC=cli
MAX_ALLOWED_CLI=14.0
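For intuition about what a threshold such as 14.0 is compared against, here is a rough sketch of the standard Coleman-Liau formula, 0.0588 L - 0.296 S - 15.8, where L is letters per 100 words and S is sentences per 100 words. The word and sentence counting below is deliberately naive and is not how ACORN tokenizes text; the function names and the "less than or equal" comparison are assumptions.

```python
import re


def coleman_liau_index(text):
    """Approximate the Coleman-Liau Index using naive word and sentence counts."""
    words = text.split()
    if not words:
        raise ValueError("text must contain at least one word")
    letters = sum(ch.isalpha() for ch in text)
    # Treat each run of terminal punctuation as one sentence boundary (rough heuristic)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    letters_per_100 = letters / len(words) * 100
    sentences_per_100 = sentences / len(words) * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


def within_limit(text, max_allowed=14.0):
    """Assumption: a score equal to the maximum still passes."""
    return coleman_liau_index(text) <= max_allowed


sample = "The cat sat on the mat. It was happy."
print(round(coleman_liau_index(sample), 2), within_limit(sample))
```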

  1. See the readability module documentation for a full list of available readability metrics.

  2. ⚠️ Skipping checksum verification may expose you to security risks. Use this option with caution.

👨‍⚕️ Doctor

Diagnose and fix issues with the host environment to enable ACORN functionality

Warning

This documentation is a work in progress. Some sections may be incomplete or subject to change.

📥 Download

Download files from an ACORN bucket to your local filesystem

  • See the configuration documentation for details on configuring ACORN commands.
  • By default, download saves files to ./content in the current working directory. Use the --output flag to specify a different output path.

Example Usage

# Download research activity data from a list of buckets
acorn download --config /path/to/.acorn.json

# Download research activity data to a specific output directory
acorn download --config /path/to/.acorn.yml --output /path/to/output

Local vs Remote

The download command copies files from a local ACORN bucket when local file:// URIs are used for associated buckets in the configuration file. Use "git" as the provider for local buckets.

"buckets": [
    {
        "name": "test (local)",
        "repository": {
            "provider": "git",
            "location": "file:./tests/fixtures/data/bucket/"
        }
    },
    {
        "name": "nssd (remote)",
        "repository": {
            "provider": "gitlab",
            "location": {
                "scheme": "https",
                "uri": "https://code.ornl.gov/research-enablement/buckets/nssd"
            }
        }
    }
]
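The local-versus-remote distinction can be illustrated with a short sketch that inspects each bucket's location. This is not ACORN's configuration loader; the helper names are hypothetical, and treating any location that starts with `file:` as local is an assumption drawn from the example above.

```python
import json

# The two buckets from the example above, wrapped in an object so the JSON parses
CONFIG = json.loads("""
{
  "buckets": [
    {"name": "test (local)",
     "repository": {"provider": "git",
                    "location": "file:./tests/fixtures/data/bucket/"}},
    {"name": "nssd (remote)",
     "repository": {"provider": "gitlab",
                    "location": {"scheme": "https",
                                 "uri": "https://code.ornl.gov/research-enablement/buckets/nssd"}}}
  ]
}
""")


def bucket_uri(bucket):
    """Return the bucket location as a string, whichever shape the config uses."""
    repository = bucket["repository"]
    # Some configurations place "uri" directly on the repository object
    location = repository.get("location", repository.get("uri", ""))
    if isinstance(location, dict):
        location = location.get("uri", "")
    return location


def is_local(bucket):
    """Assumption: a location beginning with 'file:' denotes a local bucket."""
    return bucket_uri(bucket).startswith("file:")


for bucket in CONFIG["buckets"]:
    print(bucket["name"], "local" if is_local(bucket) else "remote")
```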

GitLab vs GitHub

The download command supports both GitLab and GitHub remote repositories for ACORN buckets. The configuration for each is similar, with the main difference being the provider field in the repository object.

"buckets": [
    {
        "name": "ccsd (gitlab)",
        "repository": {
            "provider": "gitlab",
            "id": 17410,
            "uri": "https://code.ornl.gov/research-enablement/buckets/ccsd"
        }
    },
    {
        "name": "test (github)",
        "repository": {
            "provider": "github",
            "uri": "https://github.com/jhwohlgemuth/bucket"
        }
    }
]

📤 Export

Export research activity data to various formats for analysis and sharing

This command allows you to export research activity data from ACORN into different formats such as PDF, Markdown, YAML, or PPTX. With export, you can easily share your research with sponsors, collaborators, or the general public in a variety of contexts.

export supports exporting individual research activity indices or entire directories containing multiple indices.

ACORN lets you maintain research activity data as a persistent, interconnected single source of truth, so you can generate a variety of output artifacts while keeping them consistent and accurate across all selected formats.

graph LR
    data("JSON<br/>(index.json)") --> export{Export}
    export --> PDF("PDF<br/>(fact sheet)")
    export --> PPTX("PPTX<br/>(presentation)")
    export --> more("More to come...<br/>(Markdown, YAML, etc.)")

Tip

You can see the result of the export command in action by visiting the ORNL Research Activity Index, which features a variety of research activity data presented in different formats.

Example Usage

# Export research activity data to PDF fact sheet
acorn export /path/to/index.json --format pdf

# Create (single slide) PowerPoint presentations from all research activity data in a directory
acorn export /path/to/project/ --format powerpoint

PowerPoint Reference Template

You can customize the PowerPoint export by providing a reference template using the --reference option. This allows you to define specific styles, layouts, and branding for your presentations.

acorn export /path/to/index.json \
    --format powerpoint \
    --reference /path/to/reference.pptx

The reference template lets you control which values are used and where they appear by means of placeholder text in the format {{ PLACEHOLDER_NAME }}. During export, ACORN replaces these placeholders with the corresponding values from the associated research activity data. An example PowerPoint reference template used for testing is available in the ACORN GitLab repository.

Available Placeholders

The following placeholders can be used in your PowerPoint reference template:

String values

  • caption - First image caption
  • challenge - Challenge description
  • citation - DOI citation
  • email - Contact email
  • first - Contact first name
  • focus - Research focus area
  • last - Contact last name
  • managers - Manager names (joined with "and")
  • mission
  • notes - Presentation notes (intended to be added as PowerPoint speaker notes)
  • partners - Partner names (joined with ", ")
  • programs - Program names (joined with "and")
  • subtitle
  • title

Lists (bullet points)

  • achievement
  • areas - Research areas
  • impact
  • technical - Technical approach
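To make the placeholder mechanism concrete, the sketch below substitutes {{ PLACEHOLDER_NAME }} tokens in a plain string. It is only an illustration of the idea: it does not read or write PPTX files, the input dictionary keys are hypothetical, and leaving unknown placeholders untouched is an assumption rather than documented ACORN behavior.

```python
import re

# Matches {{ name }} with optional whitespace inside the braces
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def build_values(rad):
    """Flatten a (hypothetical) RAD dictionary into placeholder strings."""
    return {
        "title": rad.get("title", ""),
        "managers": " and ".join(rad.get("managers", [])),  # joined with "and"
        "partners": ", ".join(rad.get("partners", [])),      # joined with ", "
        "programs": " and ".join(rad.get("programs", [])),  # joined with "and"
    }


def fill(template, values):
    """Replace known placeholders; unknown ones are left as-is so they are easy to spot."""
    return PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), m.group(0))), template)


rad = {"title": "ACORN", "managers": ["Ada", "Grace"], "partners": ["NREL", "PNNL"]}
print(fill("{{ title }}: managed by {{ managers }} with {{ partners }}", build_values(rad)))
```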

🤖 Format

Auto-fix and format research activity data (RAD) files to maintain consistency, resolve values, and improve prose quality

Example Usage

# Format a specific research activity index
acorn format path/to/project/index.json

# Format all research activity data in a directory
acorn format path/to/project/

# Perform a dry run to see proposed changes without modifying files
acorn format path/to/project/index.json --dry-run

Example Output

  meta:
    keywords:
-   - automatin
+   - automation
    technology:
-   - JavaScript
-   - TypeSpec
    - astro
+   - javascript
    - react
-   - rs
+   - rust
+   - typespec
    sponsors:
-   - DOD
+   - Department of Defense
    ...
  contact:
    jobTitle: Primary Investigator
    givenName: Jasdrey
    familyName: Wohlson
    email: me@example.com
-   telephone: '(123) 456-7890'
+   telephone: '123.456.7890'
    url: https://www.ornl.gov/staff-profile/jason-h-wohlgemuth
-   organization: GSHS
+   organization: Geospatial Science and Human Security Division
+   affiliation: National Security Sciences Directorate

Features

  • 🛠️ Auto-fixing: Automatically fix common inconsistencies in RAD structure and prose
  • 🎨 Consistent Formatting: Ensure consistent JSON formatting across all data files
  • 🩹 Resolve Values: Resolve certain values against controlled vocabularies to ensure meaning is conveyed correctly
  • 🖼️ Resolve missing images: Find the first image in associated RAD folders and add it to the metadata if missing

Resolved Values

  • meta.keywords: Resolve keywords against ACORN Keywords Vocabulary
  • meta.technology: Resolve technology against ACORN Technology Vocabulary
  • meta.partners: Resolve partner names against ACORN Partners Vocabulary
  • meta.sponsors: Resolve sponsor names against ACORN Sponsors Vocabulary
  • contact.organization: Resolve organization name against a given org chart (currently only supports ORNL org chart)
  • contact.affiliation: Resolve affiliation name against a given org chart (currently only supports ORNL org chart)

Examples

  • Keywords: "ai" → resolves to "artificial-intelligence"
  • Partners: "NREL" → resolves to "National Renewable Energy Laboratory"
  • Technologies: "rs" → resolves to "rust"
  • Sponsors: "Dept. of Energy" → resolves to "Department of Energy"
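Resolution against a controlled vocabulary can be pictured as an alias lookup followed by normalization. The sketch below is an assumption-laden illustration, not ACORN's resolver: the alias tables are tiny stand-ins for the real vocabularies, and lowercasing, deduplicating, and sorting technology terms is inferred from the example output shown earlier.

```python
# Tiny stand-in vocabularies; the real ACORN vocabularies are not reproduced here
TECHNOLOGY_ALIASES = {"rs": "rust"}
SPONSOR_ALIASES = {
    "dod": "Department of Defense",
    "dept. of energy": "Department of Energy",
}


def resolve_terms(values, aliases):
    """Resolve tag-like terms: lowercase, map aliases, drop duplicates, sort."""
    resolved = set()
    for value in values:
        key = value.strip().lower()
        resolved.add(aliases.get(key, key))
    return sorted(resolved)


def resolve_name(value, aliases):
    """Resolve a display name: map a known alias, otherwise keep the original text."""
    cleaned = value.strip()
    return aliases.get(cleaned.lower(), cleaned)


print(resolve_terms(["JavaScript", "TypeSpec", "astro", "react", "rs"], TECHNOLOGY_ALIASES))
print(resolve_name("DOD", SPONSOR_ALIASES))
```

Tag-like fields and display names are handled separately here because lowercasing suits the former but would damage names such as "Department of Energy".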

🕸️ Link

Add linked data context to research activity data and create JSON-LD documents

The link command augments research activity data (RAD), as provided by a human user, with linked data context and outputs JSON-LD documents. This process involves mapping the input data to established ontologies and vocabularies, thereby enhancing its interoperability and semantic richness.

ACORN enables ingesting RAD in a sort of "pre-compacted" format. The link command makes the input RAD machine-readable, ready for programmatic expansion.

Tip

Linked Data empowers people that publish and use information on the Web. It is a way to create a network of standards-based, machine-readable data across Web sites.

Example Usage

# Link a specific research activity index
acorn link path/to/project/index.json

# Link all research activity data in a directory
acorn link path/to/project/

Tip

The link command is almost identical in usage to the format command, complete with support for the --dry-run flag to preview changes without creating files.

Example Output

    "contact": {
+     "@context": {
+       "jobTitle": "https://schema.org/jobTitle",
+       "givenName": "https://schema.org/givenName",
+       "familyName": "https://schema.org/familyName",
+       "identifier": "https://orcid.org",
+       "email": "https://schema.org/email",
+       "telephone": "https://schema.org/telephone",
+       "url": "https://schema.org/url",
+       "organization": "https://schema.org/worksFor",
+       "affiliation": "https://schema.org/affiliation"
+     },
+     "@type": "https://schema.org/person",
      "jobTitle": "Primary Investigator",
      "givenName": "Audson",
      "familyName": "Cargohlmuth",
      "email": "wohlgemuthjh@ornl.gov",
      "telephone": "865.576.7658",
      "url": "https://www.ornl.gov/staff-profile/jason-h-wohlgemuth",
      "organization": "Geospatial Science and Human Security Division",
      "affiliation": "National Security Sciences Directorate"
    }
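The transformation shown in the example output amounts to attaching a term-to-IRI mapping and a type to the plain contact object. The sketch below reproduces that shape using only the standard library. It is not the link command's implementation: the function name is hypothetical, the sample contact values are placeholders, and the context and type values are copied from the example output above.

```python
import json

# Term-to-IRI mapping copied from the example output above
CONTACT_CONTEXT = {
    "jobTitle": "https://schema.org/jobTitle",
    "givenName": "https://schema.org/givenName",
    "familyName": "https://schema.org/familyName",
    "identifier": "https://orcid.org",
    "email": "https://schema.org/email",
    "telephone": "https://schema.org/telephone",
    "url": "https://schema.org/url",
    "organization": "https://schema.org/worksFor",
    "affiliation": "https://schema.org/affiliation",
}


def link_contact(contact):
    """Return a copy of the contact with @context and @type placed first."""
    linked = {"@context": dict(CONTACT_CONTEXT), "@type": "https://schema.org/person"}
    linked.update(contact)  # original keys follow, unchanged
    return linked


contact = {"givenName": "Audson", "familyName": "Cargohlmuth", "email": "me@example.com"}
print(json.dumps(link_contact(contact), indent=2))
```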

📦 Packages

To enable broad adoption of ACORN, we provide packages for multiple popular programming contexts.

graph LR
    A["acorn-lib<br/>(Rust crate)"] -->|"&nbsp;is dependency of&nbsp;"|B["acorn-cli<br/>(Rust crate)"]
    A -->|"generates<br/>&nbsp;bindings for&nbsp;"|C["acorn-py<br/>(Python package)"]
    A -->|"transpiles<br/>&nbsp;API subset to&nbsp;"|D["acorn-web<br/>(WASM package)"]

🦀 Rust crate

The ACORN CLI application is built on top of the acorn-lib Rust crate. The acorn-lib library provides core functionalities for working with ACORN schemas, validating persistent identifiers [1], and generating artifacts.

Installation

  • Add acorn-lib as a dependency in your Cargo.toml [2]

    [dependencies]
    acorn-lib = "0.1.45"
    
  • Use acorn-lib functions in your Rust code

    #![allow(unused)]
    fn main() {
    use acorn::schema::validate::{is_ark, is_doi, is_orcid, is_ror};
    
    assert!(is_ark("ark:/1234/w5678"));
    assert!(is_doi("10.11578/dc.20250604.1"));
    assert!(is_orcid("https://orcid.org/0000-0002-2057-9115"));
    assert!(is_ror("01qz5mb56"));
    }
  • If you want to use the full capability of acorn-lib to build your own CLI app, be sure to include the following in your Cargo.toml to enable the doctor and powerpoint features [3]

    [dependencies]
    acorn-lib = { version = "0.1.45", features = ["doctor", "powerpoint"] }
    

  1. Persistent Identifiers (PIDs) supported by acorn-lib include DOIs, ORCIDs, RAiDs, RORs, and ARKs.

  2. Check crates.io for the latest version.

  3. See the Cargo.toml for more details.

🐍 Python Bindings

ACORN seeks to meet scientists where they are in all aspects. This includes the programming languages they use. The acorn-py package provides Python bindings to the core ACORN functionalities provided by the acorn-lib Rust crate.

Installation

  • See the PyPI page for installation and usage instructions
  • Use acorn-lib functions in Python
    from acorn.schema.validate import is_ark, is_doi, is_orcid, is_ror
    
    assert is_ark("ark:/1234/w5678")
    assert is_doi("10.11578/dc.20250604.1")
    assert is_orcid("https://orcid.org/0000-0002-2057-9115")
    assert is_ror("01qz5mb56")
    

Working with scientific artifact identifiers

The acorn-py package provides tools for working with common scientific artifact identifiers such as DOIs, ARKs, ORCIDs, and RORs, as well as indirectly related identifiers such as patent numbers and book identifiers (e.g., ISBNs). You can validate these identifiers, work with their components, and even extract them from text.

Tip

See the acorn-lib documentation for more details on persistent identifiers and how to work with them in your code.

Example

Find all patent numbers in a string

🐍 Python

from acorn.schema.pid import Patent

text = "The patent number for my work is US1234567B1."
values = Patent.find_all(text)
patent = values[0]

assert str(patent) == "US 1234567 B1"
assert patent.country_code == "US"
assert patent.serial_number == "1234567"
assert patent.kind_code == "B1"

🦀 Rust

#![allow(unused)]
fn main() {
    use acorn::schema::pid::Patent;

    let text = "The patent number for my work is US1234567B1.";
    let values = Patent::find_all(&text);
    // Borrow the first match; indexing by value would try to move it out of the collection
    let patent = &values[0];

    assert_eq!(patent.to_string(), "US 1234567 B1");
}
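To show what the components in the examples above correspond to, here is a standalone regular-expression sketch that splits a patent number into a country code, serial number, and kind code. It is not how acorn-py or acorn-lib parse patents; the pattern is a deliberately simplified assumption (two-letter country code, six to eight digits, a letter with an optional digit), and real patent number formats vary more than this.

```python
import re

# Simplified, assumed pattern: country code, serial number, kind code
PATENT = re.compile(r"\b([A-Z]{2})\s?(\d{6,8})\s?([A-Z]\d?)\b")


def find_patents(text):
    """Return a dictionary of components for each patent-like match in the text."""
    return [
        {"country_code": m.group(1), "serial_number": m.group(2), "kind_code": m.group(3)}
        for m in PATENT.finditer(text)
    ]


def format_patent(parts):
    """Render the components with single spaces between them."""
    return f"{parts['country_code']} {parts['serial_number']} {parts['kind_code']}"


matches = find_patents("The patent number for my work is US1234567B1.")
print(format_patent(matches[0]))
```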

API Consistency

The acorn-py API strives to adhere to the Rust API as closely as possible. If you know acorn-lib, you know acorn-py.

Validate DOIs

🦀 Rust

#![allow(unused)]
fn main() {
use acorn::schema::validate::DOI;

assert!("10.11578/dc.20250604.1".is_doi());
}

🐍 Python

from acorn.schema.validate import is_doi

assert is_doi("10.11578/dc.20250604.1")

Find all DOI values in a string

🦀 Rust

#![allow(unused)]
fn main() {
use acorn::schema::pid::DOI;

let pid = "https://doi.org/10.11578/dc.20250604.1";
let text = format!("The DOI for ACORN is: {pid}");
let values = DOI::find_all(&text);
assert_eq!(values[0].identifier(), "10.11578/dc.20250604.1");
}

🐍 Python

from acorn.schema.pid import DOI

pid = "https://doi.org/10.11578/dc.20250604.1"
text = f"The DOI for ACORN is: {pid}"
values = DOI.find_all(text)
assert values[0].identifier == "10.11578/dc.20250604.1"

ACORN in a Browser

Warning

This documentation is a work in progress. Some sections may be incomplete or subject to change.