Overview
ACORN [1] provides and operationalizes an ontology for research activity data (RAD). It adds linked data context and transforms RAD into a knowledge graph that is amenable to automated reasoning and artifact generation (e.g., PDF and PPTX files).
ACORN is a command-line multi-tool that uses automated processes to inform and enforce defined content schemas. With these content schemas, ACORN builds communication assets such as PDFs, presentation files, and web pages. It also lays the foundation for deep data insights about ORNL’s (and any institution’s) corpus of research. Built with the memory-safe Rust programming language, ACORN runs on Windows, macOS, and Linux.

[1] ACORN stands for “Accessible Content Optimization for Research Needs.”
So what? Big deal? Who cares?
Important
If you do science, you need ACORN.
What is ACORN trying to solve?
Accessible Content Optimization for Research Needs (ACORN) allows science achievers and communicators to create analysis-ready research activity data. ACORN’s associated schemas standardize how research is codified and communicated to capture the entire research architecture, including what research is being done, how it’s done, and how it all relates.
Built with memory-safe Rust, ACORN is a portable solution that uses statically typed and dynamically validated data structures. Schemas include unique identifiers [1] to support open science principles and practices.
ACORN can help inform researchers, sponsors, and partners, train machines on existing research projects, and identify where gaps in our research exist.
How is it novel?
ACORN focuses on research at the project level. While other systems track research projects tangentially (through people, publications, or organizations), we know of no system that employs projects as the primitive source of information. We believe codifying research activity data at the project level is the key to unlocking deep insights about an organization’s resources, funding, partnerships, and accomplishments. This network of linked data surpasses what may already be present but siloed in current search tools.
With existing tools, we can see
- publications, but we don’t know about projects without journal publications or how projects relate,
- people, but people move groups, change job roles, and leave organizations,
- approvals and timelines, but these systems often aren’t designed to integrate with other persistent identifiers (PIDs) or systems, and
- budgets, but budgets do not include the level of project detail necessary to capture appropriate scientific understanding.
ACORN does not replace or remove these systems. Each supports part of the puzzle. ACORN can help ensure they integrate and work together to provide users with valuable information from a central source of truth, no matter their entry point.
What are the risks?
ACORN carries few risks. The main ones are anything that could halt continued development, such as a lack of funding or of people to support the project, and the security risks inherent in any project hosted where potentially sensitive data lives.
Even if ACORN completely fails [4], our “escalator turns to stairs”: the data remains usable without the tool. ACORN is open source, designed to be local and decentralized, and built on existing systems, with information stored in flat files so it stays readable and shareable.
Tip
Research activity data curated and controlled by ACORN and its associated schemas and processes will never be without value.
Why should I use ACORN?
You should use ACORN if you want your project to be part of the research conversation. We initially developed ACORN for researchers, to help capture and communicate their projects more effectively. ACORN helps cut down on administrative burden, allowing the PI or a project designee to submit descriptive metadata to a form and receive automatically generated fact sheets in PDF, web, or PowerPoint form. This form, kept as a JSON file in a designated repository [6], becomes the single source of truth for information on each project. We continue to develop the metadata schema and talk with research groups to improve organizational processes.
[1] ACORN provides CURIEs and will soon also work with research activity identifiers (RAiD).
[4] Realistically, the only mode of failure is complete lack of adoption…
[6] We call these repositories “buckets” (see the page on buckets for details).
Getting Started
Important
The most popular way to use ACORN is to not install it at all. Simply run ACORN within a CI job (see this example .gitlab-ci.yml).
Installation
Install with Scoop (Windows only)
Warning
Under construction (Pull Request/Issue)
- Ensure you have Scoop installed.
- Open your terminal and add the extras bucket to Scoop:
    scoop bucket add extras
- Run the following command to install ACORN:
    scoop install acorn
- After installation, verify it by running:
    acorn help
Install with Homebrew (Linux and MacOS)
Warning
Under construction
Install with Cargo
- Ensure you have Rust and Cargo installed. If not, you can install them using rustup.rs.
- Open your terminal and run the following command to install ACORN:
    cargo install acorn-cli
- After installation, verify it by running:
    acorn help
Run Inside a Container
- Ensure you have Docker [1] installed.
- Pull the latest ACORN image with Docker [1]:
    docker pull savannah.ornl.gov/research-enablement/acorn/runner
- Run ACORN using Docker [1]:
    docker run --rm savannah.ornl.gov/research-enablement/acorn/runner help
Tip
You will probably need to mount a volume to use ACORN with your local files. For example:
docker run --rm -v $(pwd):/data savannah.ornl.gov/research-enablement/acorn/runner:latest check /data/project
Download pre-compiled binary
- Download the latest release from the ACORN GitHub Releases
- Windows
    - Open a PowerShell terminal and run:
        irm -OutFile acorn.exe -Uri https://code.ornl.gov/api/v4/projects/16689/packages/generic/x86_64-pc-windows-gnu/v0.1.50/acorn.exe
    - Test the downloaded executable:
        .\acorn.exe help
- Linux
    - Open a terminal and run:
        curl -LO https://code.ornl.gov/api/v4/projects/16689/packages/generic/x86_64-unknown-linux-musl/v0.1.45/acorn
    - Make the downloaded file executable:
        chmod +x acorn
    - Test the downloaded executable:
        ./acorn help
- MacOS
    Under construction
Tip
Move the binary to a directory included in your system’s PATH for system-wide access.
Install from source
- Clone the ACORN repository:
    git clone https://code.ornl.gov/research-enablement/acorn.git
    cd acorn
- Install the acorn command:
    cargo install --path ./acorn-cli
- After installation, verify it by running:
    acorn help
[1] These instructions will work with any OCI-compliant container runtime, such as Docker or Podman.
Concepts
ACORN is built around a few core concepts that help you understand how to use it effectively. This section will introduce you to these concepts and explain their significance in the context of ACORN and science.
Motivation
For the last few decades, scientific progress has been driven by publishing papers in scientific journals, ideally peer-reviewed ones. Publishing affords researchers recognition, career advancement, and funding opportunities. This model has worked well for a long time, but it has some serious limitations that ACORN aims to address.
- Publish or perish [1]: Scientists and researchers often care more about being published than about sharing scientific knowledge. This is not without reason, as publications are a key metric for career advancement and funding. However, this can lead to a focus on quantity over quality, and a reluctance to share negative results or data that does not support a hypothesis.
- Reproducibility crisis [2]: Many scientific results are difficult or impossible to reproduce, leading to questions about their validity. This is compounded by the fact that many publications do not provide access to the underlying data or code used in the research.
- No clear way to demonstrate science: Publications, people, and budgets do not tell the whole story. Science needs a cross-domain standard to collect and communicate the full context of scientific endeavors, including data, code, methods, and results.
ACORN addresses these issues by reducing the administrative burden of codifying the details of a research project and providing an automated framework for the sharing and analysis of scientific knowledge. Predicated on the idea of applying “science all the way down”, ACORN applies rigorous scientific principles to the management and dissemination of scientific knowledge itself.
Research Enablement
The Research Enablement Initiative (REI) at Oak Ridge National Laboratory (ORNL) aims to improve the way scientific research is conducted, shared, and evaluated. It is a team of developers, communication experts, and information scientists passionate about open science, transparency, and improving the researcher experience.
REI works toward its goals by providing:
- A cross-domain model of research activity data (RAD)
- A command-line application written in Rust - acorn-cli
- A Rust crate for working with RAD - acorn-lib
- A Python package for working with RAD - acorn-py
- A catalog of research activity data at ORNL - research.ornl.gov
Buckets
Research activity data is persisted in versioned folder collections called “buckets”. Each bucket contains a set of files and media assets that describe research activities. Buckets can be stored locally or in a remote repository, such as a GitLab or GitHub repository. For multiple examples, see ORNL’s buckets.
Buckets can be combined via a flat-file [3] configuration file using the acorn CLI tool. This allows users to aggregate data from multiple sources and generate reports or other outputs based on the combined data.
Buckets are designed to be flexible and extensible - buckets do not require a cloud provider or expensive infrastructure to use. They can be stored in any version-controlled repository, such as GitLab, GitHub, or even a local file system.
Buckets are designed to enable federation and scaling while maintaining low-level control over permissions and access. This allows organizations to share data across teams and departments while maintaining control over who can access and modify the data.
Tip
See the research enablement wiki for more information on buckets.
[1] D. R. Grimes, C. T. Bauch, and J. P. A. Ioannidis, “Modelling science trustworthiness under publish or perish pressure,” Royal Society Open Science, vol. 5, no. 1, p. 171511, Jan. 2018, doi: 10.1098/rsos.171511.
[2] M. Baker, “1,500 scientists lift the lid on reproducibility,” Nature, vol. 533, no. 7604, pp. 452-454, May 2016, doi: 10.1038/533452a.
[3] A flat-file is a simple text file that contains data in a structured format, such as JSON or YAML. Flat-files are easy to read and write, and can be used to store configuration data for applications.
ACORN Schemas and Ontologies
ASPECT
A Scientific Prescription for the Efficient Classification of Technology
The ASPECT framework is a standardized methodology for classifying and describing technology components within the ACORN ecosystem. It provides a structured approach to defining the attributes and relationships of various technological elements, ensuring consistency and interoperability across different systems and applications.
ASPECT was designed with the goal of unifying our understanding of automation, AI/ML technology, and “classical” software. We focus on “technology” instead of “AI/ML technology” because the latter is a subset of the former. Furthermore, focusing on AI/ML as the end goal is not fruitful or correct. In fact, doing so is backwards. AI/ML software is not novel in any meaningful sense. Even if it were, it would still be 100% predicated on the scientific principles of software.
In the context of technology, AI/ML and automation are the same.
Tip
Key Components
Portability
- Limited
- Source
- Containerized
- Installer
- Automated Installer
- WebAssembly
Autonomy
- Manual
- Machine-assisted
- Human-as-primary
- Machine-as-primary
- Human-supervised
- Machine-only
Maturity
Maturity uses an augmented version of technology readiness levels (TRL) and includes levels 1 through 9
Motivity
- Type 0
- Type 1A
- Type 1B
- Type 2
Data
- Real or Synthetic
- Availability
- Modality
- Quality
(Hardware) Resources
- CPU
- GPU
- TPU
- FPGA
- Quantum
- Neuromorphic
- Other
Task Classification
- Perceive
- Reason
- Project
- Perceive
- Reason
- Project
Real-world Example
The ASPECT framework can be applied in various scenarios. For example:
- AI-driven Gravity Mapping
    - Portability: Source (Level 1)
    - Autonomy: Machine-assisted (Level 1)
    - Maturity: Developed (TRL 5)
    - Motivity: Type 2
    - Data: Trained on Real, Unavailable, Silver quality, Textual modality data
    - Resources: GPU
    - Task Classification: Perceive, Reason
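As an illustration, the classification above could be captured as a small record type. This is only a sketch: the `AspectProfile` class and its field names are hypothetical and are not part of any ACORN schema; the values are copied from the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AspectProfile:
    """Hypothetical container for the seven ASPECT attributes."""
    name: str
    portability: str
    autonomy: str
    maturity: str
    motivity: str
    data: Tuple[str, ...]
    resources: Tuple[str, ...]
    tasks: Tuple[str, ...]

    def summary(self) -> str:
        # One line per attribute, in the order ASPECT lists them
        return "\n".join([
            self.name,
            f"  Portability: {self.portability}",
            f"  Autonomy: {self.autonomy}",
            f"  Maturity: {self.maturity}",
            f"  Motivity: {self.motivity}",
            f"  Data: {', '.join(self.data)}",
            f"  Resources: {', '.join(self.resources)}",
            f"  Task Classification: {', '.join(self.tasks)}",
        ])


gravity_mapping = AspectProfile(
    name="AI-driven Gravity Mapping",
    portability="Source (Level 1)",
    autonomy="Machine-assisted (Level 1)",
    maturity="Developed (TRL 5)",
    motivity="Type 2",
    data=("Real", "Unavailable", "Silver quality", "Textual modality"),
    resources=("GPU",),
    tasks=("Perceive", "Reason"),
)

print(gravity_mapping.summary())
```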
Software Portability
Software portability refers to the ease with which software applications or components can be transferred and adapted to operate in different computing environments, platforms, or systems with minimal modification. This ASPECT attribute is crucial for ensuring that software can function effectively across diverse hardware configurations, operating systems, and cloud environments.
Autonomy
In the context of ASPECT, “autonomy” characterizes a technology’s level of human-machine teaming and describes the adaptive bi-directional team interaction among humans and machines that augments human capabilities for improved outcomes. This attribute builds on the Society of Automotive Engineers’ six levels of driving automation and expands on ISO definitions to partition technology into distinct and employable categories.
Tip
ISO/IEC 22989 [1] defines human-machine teaming as “integration of human interaction with machine intelligence capabilities.”
Levels
Manual (HMT 0)
The execution of a simple script does not imply autonomy beyond manual operation. This level is characterized by the execution of a script where the deterministic outcome is fully known and controlled by the human operator.
Machine-assisted (HMT 1)
In this level, the machine provides assistance to the human operator in executing a task. The human remains in full control of the task execution, with the machine offering support or suggestions as needed. This might include an iterative script that augments the input during each iteration based on prior outputs.
Human as primary (HMT 2)
Machine as primary (HMT 3)
Human supervisor (HMT 4)
Machine only (HMT 5)
[1] ISO/IEC 22989:2022, Information Technology - Artificial Intelligence Concepts and Terminology
Maturity
Levels
- “Greenfield” research
- Basic research (“Goal-oriented research”, TRL 1)
- Technology concept (“Proof of principle”, TRL 2)
- Feasible (“Systems development”, TRL 3)
- Developing (“Proof of concept”, TRL 4)
- Prototype (“Application development”, TRL 5)
- Operational (“Integrations”, TRL 6)
- Mission ready (TRL 7)
- Mission capable (“Deployment”, TRL 8)
Tip
Discussion
ASPECT leverages an augmented version of technology readiness levels (TRL) to indicate maturity. Wikipedia defines TRLs as “a method for estimating the maturity of technologies during the acquisition phase of a program. TRLs enable consistent and uniform discussions of technical maturity across different types of technology.” TRLs are widely used across the U.S. federal acquisitions community to assess the maturity of a particular technology. ASPECT also incorporates work [1] that adapted TRLs to address the particular nuances of machine learning systems. In addition, ASPECT maturity levels work well with the capability maturity model, but go beyond maintenance processes and efficiency to characterize the maturity of the technology in practice. Specifically, ASPECT openness and portability attributes create a well-defined understanding of a given technology.
Ultimately, the maturity attribute of ASPECT combines with the other ASPECT attributes to provide a holistic view of the system under consideration, beyond simply how long the technology has been around or how well maintained it is. Maturity in ASPECT reflects the degree to which a system has been tested, validated, and proven in real-world scenarios, as well as its readiness for deployment and integration into existing workflows.
[1] A. Lavin et al., “Technology readiness levels for machine learning systems,” Nat Commun, vol. 13, no. 1, p. 6039, Oct. 2022, doi: 10.1038/s41467-022-33128-9.
Motivity
Discussion
“Motivity” describes a technology’s ability to exert power over its environment. In the context of ASPECT, this involves how a given technology interacts with its environment and the degree of autonomy it possesses in performing tasks. As a model of interaction, motivity is built on task classification categories and is closely related to the human-machine teaming level. Motivity can also be viewed in terms of data binding - no binding, one-way binding, and two-way binding.
Motivity is a somewhat uncommon word. It was chosen for the ASPECT framework in part because deliberate ambiguity can foster conceptual depth. Rarity minimizes external preconceptions, enabling custom layering of meanings without the baggage of a widely used term.
Motivity has been defined in various contexts across philosophy, biology, and psychology, often emphasizing an intrinsic capacity for motion or change rather than an external force, which is consistent with its etymology. Motivity uniquely captures an inherent “motive power” or self-initiating force for change, aligning with a data model’s bidirectional synchronization as an active, propulsive property rather than passive reactivity (which implies response) or linkage (which implies structural connection).
Similar niche terms like “affordance” in HCI gained traction despite initial obscurity and today offer rich, nuanced meanings.
Types
Motivity is logically partitioned into four distinct types based on the level of interaction with the environment. Motivity is built around the same tasks in task classification - perception, projection, and comprehension, which essentially act as proxies for one-way data binding (input), one-way data binding (output), and computation, respectively.
Type 0
Note
Example: Digital model
Type 0 has no interaction with the environment. Type 0 agency can involve “comprehension” (e.g., simulations) but also includes technology that does not (e.g., a hammer). Although comprehension is optional in every type, including Type 0, in the case of Type 0 technologies it is not a very interesting thing to consider.
Type 1A
Note
Example: Data shadow [1]
Type 1A is arguably the most common form of technology, at least in the context of research. Although it is mildly ambiguous in interpretation, the use of an interactive notebook (e.g., Jupyter) to visualize the distribution of some data read from a CSV file would be considered as Type 1A since the technology reads input from the environment (a CSV file) and presents an aspect of the data visually. Technically, the associated image is a created artifact stored with zeroes and ones, but for the purpose of classification via purpose of agency, we consider such ephemeral artifacts as insufficient to justify Types 1B or 2.
Type 1B
Note
Example: Cron job that emails status of long running calculation
Type 2
Note
Example: Digital twin [1]
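The four types can be read as a two-flag classification over perception (input from the environment) and projection (output to the environment). The sketch below is an interpretation rather than a definition from ASPECT: the function name is hypothetical, and the mapping of Type 1B to output-only binding is inferred from the cron job example. Comprehension is left out because the text treats it as optional in every type.

```python
def motivity_type(perceives: bool, projects: bool) -> str:
    """Classify a technology by how it binds to its environment.

    perceives: reads input from the environment (one-way binding, input)
    projects:  writes output to the environment (one-way binding, output)
    """
    if perceives and projects:
        return "Type 2"   # two-way binding, e.g. a digital twin
    if perceives:
        return "Type 1A"  # input only, e.g. a data shadow
    if projects:
        return "Type 1B"  # output only (assumed), e.g. a status-email cron job
    return "Type 0"       # no binding, e.g. a digital model


for flags in [(False, False), (True, False), (False, True), (True, True)]:
    print(flags, "->", motivity_type(*flags))
```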
[1] Y. K. Liu, S. K. Ong, and A. Y. C. Nee, “State-of-the-art survey on digital twin implementations,” Adv. Manuf., vol. 10, no. 1, pp. 1-23, Mar. 2022, doi: 10.1007/s40436-021-00375-w.
Data
Warning
This documentation is a work in progress. Some sections may be incomplete or subject to change.
Resources
Warning
This documentation is a work in progress. Some sections may be incomplete or subject to change.
Task Types
Warning
This documentation is a work in progress. Some sections may be incomplete or subject to change.
Citations
A selection of academic papers and resources that have influenced the development of ACORN
[1] E. Njor, M. A. Hasanpour, J. Madsen, and X. Fafoutis, “A Holistic Review of the TinyML Stack for Predictive Maintenance,” IEEE Access, vol. 12, pp. 184861-184882, 2024, doi: 10.1109/ACCESS.2024.3512860.
[2] Y. Yang et al., “A Survey of AI Agent Protocols,” Apr. 26, 2025, arXiv: arXiv:2504.16736. doi: 10.48550/arXiv.2504.16736.
[3] B. Liu et al., “Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems,” Mar. 31, 2025, arXiv: arXiv:2504.01990. doi: 10.48550/arXiv.2504.01990.
[4] “AI Blindspot: A Discovery Process for preventing, detecting, and mitigating bias in AI systems.” Accessed: Jan. 24, 2023. [Online]. Available: https://aiblindspot.media.mit.edu/
[5] V. Gadepally et al., “AI Enabling Technologies: A Survey,” May 08, 2019, arXiv: arXiv:1905.03592. doi: 10.48550/arXiv.1905.03592.
[6] A. Jain, S. Sharma, and S. Duggal, “Comparative Study of Various Process Model in Software Development,” 2013. Accessed: Jan. 24, 2023. [Online]. Available: semanticscholar.org
[7] Q. Hua et al., “Context Engineering 2.0: The Context of Context Engineering,” Oct. 30, 2025, arXiv: arXiv:2510.26493. doi: 10.48550/arXiv.2510.26493.
[8] N. D. Lawrence, “Data Readiness Levels,” May 05, 2017, arXiv: arXiv:1705.02245. doi: 10.48550/arXiv.1705.02245.
[9] A. Fuller, Z. Fan, C. Day, and C. Barlow, “Digital Twin: Enabling Technologies, Challenges and Open Research,” IEEE Access, vol. 8, pp. 108952-108971, 2020, doi: 10.1109/ACCESS.2020.2998358.
[10] J. Gou, B. Yu, S. J. Maybank, and D. Tao, “Knowledge Distillation: A Survey,” Int J Comput Vis, vol. 129, no. 6, pp. 1789-1819, June 2021, doi: 10.1007/s11263-021-01453-z.
[11] D. Kreuzberger, N. Kühl, and S. Hirschl, “Machine Learning Operations (MLOps): Overview, Definition, and Architecture,” May 14, 2022, arXiv: arXiv:2205.02302. doi: 10.48550/arXiv.2205.02302.
[12] M. Mitchell et al., “Model Cards for Model Reporting,” in Proceedings of the Conference on Fairness, Accountability, and Transparency, Jan. 2019, pp. 220-229. doi: 10.1145/3287560.3287596.
[13] E. Blasch, J. Sung, and T. Nguyen, “Multisource AI Scorecard Table for System Evaluation,” Feb. 07, 2021, arXiv: arXiv:2102.03985. doi: 10.48550/arXiv.2102.03985.
[14] F. Yu, H. Zhang, and B. Wang, “Natural Language Reasoning, A Survey,” Mar. 26, 2023, arXiv: arXiv:2303.14725. doi: 10.48550/arXiv.2303.14725.
[15] S. Zhao, Y. Yang, Z. Wang, Z. He, L. K. Qiu, and L. Qiu, “Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely,” Sept. 23, 2024, arXiv: arXiv:2409.14924. Accessed: Oct. 02, 2024. [Online]. Available: arxiv.org
[16] Y. K. Liu, S. K. Ong, and A. Y. C. Nee, “State-of-the-art survey on digital twin implementations,” Adv. Manuf., vol. 10, no. 1, pp. 1-23, Mar. 2022, doi: 10.1007/s40436-021-00375-w.
[17] Center for Security and Emerging Technology and B. Buchanan, “The AI Triad and What It Means for National Security Strategy,” Center for Security and Emerging Technology, Aug. 2020. doi: 10.51593/20200021.
[18] J. M. Bradshaw, R. R. Hoffman, D. D. Woods, and M. Johnson, “The Seven Deadly Myths of ‘Autonomous Systems,’” IEEE Intelligent Systems, vol. 28, no. 3, pp. 54-61, May 2013, doi: 10.1109/MIS.2013.70.
[19] M. R. Endsley, “Toward a Theory of Situation Awareness in Dynamic Systems,” Human Factors, vol. 37, no. 1, pp. 32-64, 1995, doi: 10.1518/001872095779049543.
Command Line Reference
~ Accessible Content Optimization for Research Needs ~
Usage: acorn [FLAGS] [COMMAND]
COMMANDS:
check Perform static analysis on research activity data and apply standardized best practices
doctor Diagnose and correct system requirements for using acorn
download Download research activity data from buckets
export Export research activity data to a specific target
format Formats research activity data in place (inherently includes some elements of `acorn check`)
link Add linked data context to research activity data
schema Print research activity data (RAD) or research activity identifier (RAiD) metadata JSON schema to stdout
help Print this message or the help of the given subcommand(s)
FLAGS:
-X, --offline
Prevent communication with the internet - intended for disconnected local environments
Note: Use of --offline may require extra configuration options for certain commands
-t, --threads <N>
Limit number of threads used by rayon for parallel processing
See Rayon documentation for more information
[default: 0]
-v, --verbose...
Increase logging verbosity
-q, --quiet...
Decrease logging verbosity
-V, --version
Prints version information
-h, --help
Print help (see a summary with '-h')
Configuration
Flat-file
ACORN can be configured using JSON or YAML format files. These configuration files allow users to specify various options and settings for the ACORN CLI tool, including input and output directories, logging levels, and other parameters.
Example
.acorn.json file
{
"buckets": [
{
"name": "bessd",
"repository": {
"provider": "gitlab",
"id": 17603,
"uri": "https://code.ornl.gov/research-enablement/buckets/bessd"
}
},
{
"name": "ccsd",
"repository": {
"provider": "gitlab",
"id": 17602,
"uri": "https://code.ornl.gov/research-enablement/buckets/ccsd"
}
},
{
"name": "nssd",
"repository": {
"provider": "gitlab",
"id": 17410,
"uri": "https://code.ornl.gov/research-enablement/buckets/nssd"
}
}
]
}
Tip
.acorn.json is the default configuration file name that ACORN looks for in the current working directory. You can also specify a different configuration file using the --config <FILE> flag when running ACORN commands.
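To show how the structure above can be read programmatically, here is a minimal sketch that parses a trimmed copy of the example and lists each bucket. It uses only the fields that appear in the example (`name`, `repository.provider`, `repository.uri`); the function name is hypothetical and this is not how ACORN itself loads configuration.

```python
import json

# Trimmed copy of the example .acorn.json shown above
EXAMPLE_CONFIG = """
{
  "buckets": [
    {
      "name": "bessd",
      "repository": {
        "provider": "gitlab",
        "id": 17603,
        "uri": "https://code.ornl.gov/research-enablement/buckets/bessd"
      }
    },
    {
      "name": "nssd",
      "repository": {
        "provider": "gitlab",
        "id": 17410,
        "uri": "https://code.ornl.gov/research-enablement/buckets/nssd"
      }
    }
  ]
}
"""


def summarize_buckets(config_text):
    """Return (name, provider, uri) for every bucket in the configuration."""
    config = json.loads(config_text)
    return [
        (bucket["name"], bucket["repository"]["provider"], bucket["repository"]["uri"])
        for bucket in config.get("buckets", [])
    ]


for name, provider, uri in summarize_buckets(EXAMPLE_CONFIG):
    print(f"{name}: {provider} -> {uri}")
```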
.env file and Environment Variables
The ACORN CLI tool can also be configured using .env files and/or environment variables.
Example
.env file
ACORN_LOG_LEVEL=info
READABILITY_METRIC=ari
MAX_ALLOWED_ARI=12
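For readers unfamiliar with the format, a .env file is a list of KEY=VALUE lines. The sketch below parses the example above into a dictionary. It is a simplified illustration with a hypothetical function name; ACORN's own loader may treat quoting, comments, or precedence differently.

```python
EXAMPLE_ENV = """\
ACORN_LOG_LEVEL=info
READABILITY_METRIC=ari
MAX_ALLOWED_ARI=12
"""


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


settings = parse_env(EXAMPLE_ENV)
print(settings["READABILITY_METRIC"], float(settings["MAX_ALLOWED_ARI"]))
```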
Commands
- Audit
- Check
- Doctor
- Download
- Export
- Format
- Link
Command Workflow
stateDiagram
direction LR
[*] --> RAD: create
RAD --> check
check --> RAD
check --> format
format --> link
format --> export
Audit
Audit research activity data for completeness, consistency, and adherence to best practices - provide results in tabular format for easy review
Warning
This documentation is a work in progress. Some sections may be incomplete or subject to change.
Check
Perform various checks on associated research activity data
Example Usage
# Check a specific research activity index
acorn check path/to/project/index.json
# Check all research activity data in a directory
acorn check path/to/project/
Categories of Checks
- Schema Validation: Ensure that all data files conform to the expected schema
- Prose Quality: Analyze written content for grammar, spelling, and more
- Readability: Evaluate the readability of textual content using established metrics [1]
- Link Integrity: Verify that all hyperlinks within the content are valid and reachable
- Data Consistency: Check for consistency and completeness in datasets
- Convention Adherence: Ensure that naming conventions and organizational standards are followed
Customization Options
The check command supports several flags and options to customize its behavior (e.g., skipping certain checks, disabling certain behaviors, etc.)
Include --exit-on-first-error to stop execution upon encountering the first error.
Bypass verifying the checksum of downloaded artifacts with --skip-verify-checksum [2]
Skip checks
- --skip schema: Skip schema validation checks
- --skip prose: Skip prose quality checks
- --skip readability: Skip readability checks
- --skip schema,prose: Skip both schema validation and prose quality checks (works for any combination of categories)
- --disable-website-checks: Disable all website-related checks (link integrity, etc.)
Note
--disable-website-checks is redundant when acorn --offline is used for commands that need to be run in offline environments.
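When scripting checks (for example in a CI job), it can help to assemble the command line from a few options. The sketch below only builds the argument list and does not run anything. It uses flags documented above (`--skip`, `--exit-on-first-error`, and the global `--offline`); the helper name is hypothetical, and placing the options after the path is an assumption about argument order.

```python
import shlex


def build_check_command(path, skip=(), exit_on_first_error=False, offline=False):
    """Assemble an `acorn check` invocation as a list of arguments."""
    command = ["acorn"]
    if offline:
        # Global flag, placed before the subcommand as in `acorn --offline`
        command.append("--offline")
    command += ["check", path]
    if skip:
        # Categories are passed as one comma-separated value, e.g. schema,prose
        command += ["--skip", ",".join(skip)]
    if exit_on_first_error:
        command.append("--exit-on-first-error")
    return command


command = build_check_command("path/to/project/", skip=["schema", "prose"])
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run` on a machine where acorn is installed.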
Configure Readability
Readability can be configured by passing options directly to the command line or via a .env file. Command line options override .env settings.
- --readability-metric <METRIC>: Specify which readability metric to use
- Set READABILITY_METRIC in your .env file to choose the readability metric. Default metric is fkgl (Flesch-Kincaid Grade Level).
- Set MAX_ALLOWED_FKGL in your .env file to define the maximum acceptable FKGL score. Each metric has its own corresponding maximum score variable (e.g., MAX_ALLOWED_ARI for Automated Readability Index).
Example .env file
Configure ACORN to use the Coleman-Liau Index (CLI) readability metric with a maximum allowed score of 14.0 (default value is 12.0):
READABILITY_METRIC=cli
MAX_ALLOWED_CLI=14.0
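For context on what such a threshold means, the Coleman-Liau Index is computed as 0.0588 * L - 0.296 * S - 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words. The sketch below applies that published formula with a deliberately naive tokenizer. It is not ACORN's implementation, so scores may differ from what `acorn check` reports; the function names are hypothetical.

```python
import re


def coleman_liau_index(text):
    """Coleman-Liau Index using simple regex-based word and sentence counts."""
    words = re.findall(r"[A-Za-z0-9'-]+", text)
    if not words:
        return 0.0
    letters = sum(1 for ch in text if ch.isalpha())
    # Count runs of terminal punctuation as sentence ends (at least one sentence)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    letters_per_100 = letters / len(words) * 100
    sentences_per_100 = sentences / len(words) * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


def within_limit(text, max_allowed=14.0):
    """Mirror the idea of MAX_ALLOWED_CLI: pass if the score is at or below the limit."""
    return coleman_liau_index(text) <= max_allowed


sample = "The cat sat on the mat."
print(round(coleman_liau_index(sample), 2), within_limit(sample))
```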
[1] See the readability module documentation for a full list of available readability metrics.
[2] Skipping checksum verification may expose you to security risks. Use this option with caution.
Doctor
Diagnose and fix issues with host environment to enable ACORN functionality
Warning
This documentation is a work in progress. Some sections may be incomplete or subject to change.
Download
Obtain files from an ACORN bucket to your local filesystem
- See the configuration documentation for details on configuring ACORN commands.
- By default, download will save files to ./content in the current working directory unless an output path is specified via the --output flag.
Example Usage
# Download research activity data from a list of buckets
acorn download --config /path/to/.acorn.json
# Download research activity data to a specific output directory
acorn download --config /path/to/.acorn.yml --output /path/to/output
Local vs Remote
The download command copies files from a local ACORN bucket when a local file: URI is used for that bucket in the configuration file. Use "git" as the provider for local buckets.
"buckets": [
{
"name": "test (local)",
"repository": {
"provider": "git",
"location": "file:./tests/fixtures/data/bucket/"
}
},
{
"name": "nssd (remote)",
"repository": {
"provider": "gitlab",
"location": {
"scheme": "https",
"uri": "https://code.ornl.gov/research-enablement/buckets/nssd"
}
}
}
]
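The distinction can be illustrated with a small helper that inspects the `location` field in the two forms shown above (a plain string, or an object with a `uri`). This is a sketch with a hypothetical function name; it only mirrors the example and is not ACORN's own resolution logic.

```python
def is_local_bucket(bucket):
    """Return True when the bucket's location is a file: URI."""
    location = bucket.get("repository", {}).get("location")
    if isinstance(location, dict):
        # Object form, as in the remote example: {"scheme": ..., "uri": ...}
        location = location.get("uri", "")
    return isinstance(location, str) and location.startswith("file:")


buckets = [
    {
        "name": "test (local)",
        "repository": {"provider": "git", "location": "file:./tests/fixtures/data/bucket/"},
    },
    {
        "name": "nssd (remote)",
        "repository": {
            "provider": "gitlab",
            "location": {
                "scheme": "https",
                "uri": "https://code.ornl.gov/research-enablement/buckets/nssd",
            },
        },
    },
]

for bucket in buckets:
    kind = "local" if is_local_bucket(bucket) else "remote"
    print(f"{bucket['name']}: {kind}")
```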
GitLab vs GitHub
The download command supports both GitLab and GitHub remote repositories for ACORN buckets. The configuration for each is similar, with the main difference being the provider field in the repository object.
"buckets": [
{
"name": "ccsd (gitlab)",
"repository": {
"provider": "gitlab",
"id": 17602,
"uri": "https://code.ornl.gov/research-enablement/buckets/ccsd"
}
},
{
"name": "test (github)",
"repository": {
"provider": "github",
"uri": "https://github.com/jhwohlgemuth/bucket"
}
}
]
Export
Export research activity data to various formats for analysis and sharing
This command allows you to export research activity data from ACORN into different formats such as PDF, Markdown, YAML, or PPTX. With export, you can easily share your research with sponsors, collaborators, or the general public in a variety of contexts.
export supports exporting individual research activity indices or entire directories containing multiple indices.
ACORN maintains research activity data as persistent, interconnected single sources of truth, which makes it easy to create a variety of output artifacts while ensuring consistency and accuracy across all selected formats.
graph LR
data("JSON</br>(index.json)") --> export{Export}
export --> PDF("PDF</br>(fact sheet)")
export --> PPTX("PPTX</br>(presentation)")
export --> more("More to come...</br>(Markdown, YAML, etc.)")
Tip
You can see the result of the export command in action by visiting the ORNL Research Activity Index, which features a variety of research activity data presented in different formats.
Example Usage
# Export research activity data to PDF fact sheet
acorn export /path/to/index.json --format pdf
# Create (single slide) PowerPoint presentations from all research activity data in a directory
acorn export /path/to/project/ --format powerpoint
PowerPoint Reference Template
You can customize the PowerPoint export by providing a reference template using the --reference option. This allows you to define specific styles, layouts, and branding for your presentations.
acorn export /path/to/index.json \
--format powerpoint \
--reference /path/to/reference.pptx
The reference template lets you control which values appear and where they are placed, using placeholder text in the format {{ PLACEHOLDER_NAME }}. During export, ACORN replaces these placeholders with the corresponding values from the associated research activity data. You can find an example PowerPoint reference template used for testing in the ACORN GitLab repository.
Available Placeholders
The following placeholders can be used in your PowerPoint reference template:
String values
- caption - First image caption
- challenge - Challenge description
- citation - DOI citation
- email - Contact email
- first - Contact first name
- focus - Research focus area
- last - Contact last name
- managers - Manager names (joined with "and")
- mission
- notes - Presentation notes (intended to be added as PowerPoint speaker notes)
- partners - Partner names (joined with ", ")
- programs - Program names (joined with "and")
- subtitle
- title
Lists (bullet points)
- achievement
- areas - Research areas
- impact
- technical - Technical approach
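To make the placeholder mechanism concrete, here is a minimal sketch of the substitution idea on plain text. It is not how ACORN processes PPTX files: the function `fill_placeholders`, the sample values, and the choice to leave unknown placeholders untouched are all assumptions made for illustration.

```python
import re

# Hypothetical sketch (not ACORN's implementation): replace
# {{ PLACEHOLDER_NAME }} markers in template text with values.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_placeholders(template: str, values: dict) -> str:
    """Substitute known placeholders; leave unknown ones untouched."""

    def replace(match):
        key = match.group(1)
        return str(values.get(key, match.group(0)))

    return PLACEHOLDER.sub(replace, template)


# Example values are made up for illustration.
values = {
    "title": "Example Project",
    "managers": " and ".join(["A. Manager", "B. Manager"]),
    "partners": ", ".join(["Partner One", "Partner Two"]),
}
slide_text = "{{ title }} | {{ managers }} | {{ partners }} | {{ mission }}"
filled = fill_placeholders(slide_text, values)
print(filled)
```

Leaving an unmatched placeholder in the output, as this sketch does for `mission`, makes missing data easy to spot in a draft; the actual export may handle that case differently.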
Format
Auto-fix and format research activity data (RAD) files to maintain consistency, resolve values, and improve prose quality
Example Usage
# Format a specific research activity index
acorn format path/to/project/index.json
# Format all research activity data in a directory
acorn format path/to/project/
# Perform a dry run to see proposed changes without modifying files
acorn format path/to/project/index.json --dry-run
Example Output
meta:
keywords:
- - automatin
+ - automation
technology:
- - JavaScript
- - TypeSpec
- astro
+ - javascript
- react
- - rs
+ - rust
+ - typespec
sponsors:
- - DOD
+ - Department of Defense
...
contact:
jobTitle: Primary Investigator
givenName: Jasdrey
familyName: Wohlson
email: me@example.com
- telephone: '(123) 456-7890'
+ telephone: '123.456.7890'
url: https://www.ornl.gov/staff-profile/jason-h-wohlgemuth
- organization: GSHS
+ organization: Geospatial Science and Human Security Division
+ affiliation: National Security Sciences Directorate
Features
- 🛠️ Auto-fixing: Automatically fix common inconsistencies in RAD structure and prose
- 🎨 Consistent Formatting: Ensure consistent JSON formatting across all data files
- 🩹 Resolve Values: Resolve certain values against controlled vocabularies to ensure meaning is conveyed correctly
- 🖼️ Resolve missing images: Find first image in associated RAD folders and add to metadata if missing
Resolved Values
- meta.keywords: Resolve keywords against ACORN Keywords Vocabulary
- meta.technology: Resolve technology against ACORN Technology Vocabulary
- meta.partners: Resolve partner names against ACORN Partners Vocabulary
- meta.sponsors: Resolve sponsor names against ACORN Sponsors Vocabulary
- contact.organization: Resolve organization name against a given org chart (currently only supports ORNL org chart)
- contact.affiliation: Resolve organization name against a given org chart (currently only supports ORNL org chart)
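Value resolution of this kind can be pictured as a lookup against a table of known aliases. The sketch below reproduces the technology and sponsor changes from the example output under that assumption. The alias tables are tiny illustrative subsets, and the function names are hypothetical rather than part of ACORN.

```python
# Hypothetical sketch (not ACORN's implementation): resolve raw values
# against a small controlled vocabulary of aliases.
TECHNOLOGY_ALIASES = {"rs": "rust"}  # illustrative subset only
SPONSOR_ALIASES = {
    "DOD": "Department of Defense",
    "Dept. of Energy": "Department of Energy",
}


def resolve_technology(values: list) -> list:
    """Lowercase, map aliases to canonical names, de-duplicate, and sort."""
    resolved = {TECHNOLOGY_ALIASES.get(v.lower(), v.lower()) for v in values}
    return sorted(resolved)


def resolve_sponsors(values: list) -> list:
    """Replace known abbreviations; keep unrecognized names unchanged."""
    return [SPONSOR_ALIASES.get(v, v) for v in values]


technology = resolve_technology(["JavaScript", "TypeSpec", "astro", "react", "rs"])
sponsors = resolve_sponsors(["DOD"])
print(technology)
print(sponsors)
```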
Examples
- Keywords: "ai" → resolves to "artificial-intelligence"
- Partners: "NREL" → resolves to "National Renewable Energy Laboratory"
- Technologies: "rs" → resolves to "rust"
- Sponsors: "Dept. of Energy" → resolves to "Department of Energy"
🕸️ Link
Add linked data context to research activity data and create JSON-LD documents
The link command augments research activity data (RAD), as provided by a human user, with linked data context and outputs JSON-LD documents. This process involves mapping the input data to established ontologies and vocabularies, thereby enhancing its interoperability and semantic richness.
ACORN enables ingesting RAD in a sort of “pre-compacted” format. The link command makes the input RAD machine-readable, ready for programmatic expansion.
Tip
Linked Data empowers people that publish and use information on the Web. It is a way to create a network of standards-based, machine-readable data across Web sites.
Example Usage
# Link a specific research activity index
acorn link path/to/project/index.json
# Link all research activity data in a directory
acorn link path/to/project/
Tip
The link command is almost identical in usage to the format command, complete with support for the --dry-run flag to preview changes without creating files.
Example Output
"contact": {
+ "@context": {
+ "jobTitle": "https://schema.org/jobTitle",
+ "givenName": "https://schema.org/givenName",
+ "familyName": "https://schema.org/familyName",
+ "identifier": "https://orcid.org",
+ "email": "https://schema.org/email",
+ "telephone": "https://schema.org/telephone",
+ "url": "https://schema.org/url",
+ "organization": "https://schema.org/worksFor",
+ "affiliation": "https://schema.org/affiliation"
+ },
+ "@type": "https://schema.org/person",
"jobTitle": "Primary Investigator",
"givenName": "Audson",
"familyName": "Cargohlmuth",
"email": "wohlgemuthjh@ornl.gov",
"telephone": "865.576.7658",
"url": "https://www.ornl.gov/staff-profile/jason-h-wohlgemuth",
"organization": "Geospatial Science and Human Security Division",
"affiliation": "National Security Sciences Directorate"
}
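The transformation in the example output can be approximated in a few lines. The sketch below is an assumption-laden illustration, not the link command's implementation: the term IRIs and the `@type` value are copied from the example output, while `link_contact` and the sample contact values are hypothetical.

```python
import json

# Hypothetical sketch (not ACORN's implementation): add linked data context
# to a contact record. The term IRIs are copied from the example output.
CONTACT_CONTEXT = {
    "jobTitle": "https://schema.org/jobTitle",
    "givenName": "https://schema.org/givenName",
    "familyName": "https://schema.org/familyName",
    "identifier": "https://orcid.org",
    "email": "https://schema.org/email",
    "telephone": "https://schema.org/telephone",
    "url": "https://schema.org/url",
    "organization": "https://schema.org/worksFor",
    "affiliation": "https://schema.org/affiliation",
}
CONTACT_TYPE = "https://schema.org/person"  # as shown in the example output


def link_contact(contact: dict) -> dict:
    """Return a copy of `contact` with "@context" and "@type" placed first."""
    return {"@context": dict(CONTACT_CONTEXT), "@type": CONTACT_TYPE, **contact}


# Placeholder contact values, made up for illustration.
contact = {
    "jobTitle": "Primary Investigator",
    "givenName": "Ada",
    "familyName": "Example",
    "email": "me@example.com",
}
linked = link_contact(contact)
print(json.dumps(linked, indent=2))
```

The input record is left unmodified, which matches the idea that the human-authored RAD stays the source of truth and the JSON-LD document is a derived artifact.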
📦 Packages
To enable broad adoption of ACORN, we provide packages for multiple popular programming contexts.
graph LR
A["acorn-lib</br>(Rust crate)"] -->|" is dependency of "|B["acorn-cli</br>(Rust crate)"]
A -->|"generates</br> bindings for "|C["acorn-py</br>(Python package)"]
A -->|"transpiles</br> API subset to "|D["acorn-web</br>(WASM package)"]
🦀 Rust crate
The ACORN CLI application is built on top of the acorn-lib Rust crate. The acorn-lib library provides core functionalities for working with ACORN schemas, validating persistent identifiers¹, and generating artifacts.
Installation
- Add acorn-lib as a dependency in your Cargo.toml²
  [dependencies]
  acorn-lib = "0.1.45"
- Use acorn-lib functions in your Rust code
  #![allow(unused)]
  fn main() {
  use acorn::schema::validate::{is_ark, is_doi, is_orcid, is_ror};
  assert!(is_ark("ark:/1234/w5678"));
  assert!(is_doi("10.11578/dc.20250604.1"));
  assert!(is_orcid("https://orcid.org/0000-0002-2057-9115"));
  assert!(is_ror("01qz5mb56"));
  }
- If you want to use the full capability of acorn-lib to build your own CLI app, be sure to include the following in your Cargo.toml to enable the doctor and powerpoint features³
  [dependencies]
  acorn-lib = { version = "0.1.45", features = ["doctor", "powerpoint"] }
- Persistent Identifiers (PIDs) supported by acorn-lib include DOIs, ORCIDs, RAiDs, RORs, and ARKs.
- See the Cargo.toml for more details.
🐍 Python Bindings
ACORN seeks to meet scientists where they are in all aspects. This includes the programming languages they use. The acorn-py package provides Python bindings to the core ACORN functionalities provided by the acorn-lib Rust crate.
Installation
- See the PyPI page for installation and usage instructions
- Use acorn-lib functions in Python
  from acorn.schema.validate import is_ark, is_doi, is_orcid, is_ror
  assert is_ark("ark:/1234/w5678")
  assert is_doi("10.11578/dc.20250604.1")
  assert is_orcid("https://orcid.org/0000-0002-2057-9115")
  assert is_ror("01qz5mb56")
Working with scientific artifact identifiers
The acorn-py package provides tools to work with common scientific artifact identifiers such as DOIs, ARKs, ORCIDs, and RORs, as well as indirectly related identifiers such as patent numbers and book identifiers (e.g., ISBNs). You can validate these identifiers, work with their components, and even extract them from text!
Tip
See the acorn-lib documentation for more details on persistent identifiers and how to work with them in your code.
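To give a sense of what validating an identifier can involve, the sketch below checks an ORCID iD using the ISO 7064 MOD 11-2 checksum that ORCID publishes for its identifiers. It uses only the standard library and is not the acorn-py implementation; the function names are hypothetical, and the real `is_orcid` may apply additional or different rules.

```python
import re


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def looks_like_valid_orcid(value: str) -> bool:
    """Check shape and checksum of an ORCID iD, with or without a URL prefix."""
    identifier = value.rsplit("/", 1)[-1]  # drop "https://orcid.org/" if present
    compact = identifier.replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


print(looks_like_valid_orcid("https://orcid.org/0000-0002-2057-9115"))
print(looks_like_valid_orcid("0000-0002-2057-9116"))
```

A checksum only catches transcription errors. It does not confirm that the identifier is registered to anyone.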
Example
Find all patent numbers in a string
🐍 Python
from acorn.schema.pid import Patent
text = "The patent number for my work is US1234567B1."
values = Patent.find_all(text)
patent = values[0]
assert str(patent) == "US 1234567 B1"
assert patent.country_code == "US"
assert patent.serial_number == "1234567"
assert patent.kind_code == "B1"
🦀 Rust
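#![allow(unused)]
fn main() {
use acorn::schema::pid::Patent;
let text = "The patent number for my work is US1234567B1.";
// Collect every patent number found in the text
let values = Patent::find_all(&text);
// Borrow the first match rather than moving it out of the collection
let patent = &values[0];
assert_eq!(patent.to_string(), "US 1234567 B1");
}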
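For readers curious how this kind of extraction might work, the sketch below finds patent numbers with a regular expression and splits them into the same three components. It is a simplified stand-in, not the acorn-py implementation: the pattern, the `PatentNumber` type, and `find_patents` are assumptions, and real patent number formats vary more than this pattern allows.

```python
import re
from typing import NamedTuple

# Hypothetical sketch (not acorn-py's implementation): two-letter country
# code, a run of digits, then a kind code such as "B1".
PATENT_PATTERN = re.compile(r"\b([A-Z]{2})\s?(\d{5,})\s?([A-Z]\d?)\b")


class PatentNumber(NamedTuple):
    country_code: str
    serial_number: str
    kind_code: str

    def __str__(self) -> str:
        # Normalized display form with components separated by spaces
        return f"{self.country_code} {self.serial_number} {self.kind_code}"


def find_patents(text: str) -> list:
    """Return every substring of `text` that matches the simplified pattern."""
    return [PatentNumber(*m.groups()) for m in PATENT_PATTERN.finditer(text)]


found = find_patents("The patent number for my work is US1234567B1.")
print(str(found[0]))
```

A pattern this loose will also match strings that merely resemble patent numbers, so a dedicated parser remains the better choice for real data.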
API Consistency
The acorn-py API strives to adhere to the Rust API as closely as possible. If you know acorn-lib, you know acorn-py.
Validate DOIs
🦀 Rust
#![allow(unused)]
fn main() {
use acorn::schema::validate::DOI;
assert!("10.11578/dc.20250604.1".is_doi());
}
🐍 Python
from acorn.schema.validate import is_doi
assert is_doi("10.11578/dc.20250604.1")
Find all DOI values in a string
🦀 Rust
#![allow(unused)]
fn main() {
use acorn::schema::pid::DOI;
let pid = "https://doi.org/10.11578/dc.20250604.1";
let text = format!("The DOI for ACORN is: {pid}");
let values = DOI::find_all(&text);
assert_eq!(values[0].identifier(), "10.11578/dc.20250604.1");
}
🐍 Python
from acorn.schema.pid import DOI
pid = "https://doi.org/10.11578/dc.20250604.1"
text = f"The DOI for ACORN is: {pid}"
values = DOI.find_all(text)
assert values[0].identifier == "10.11578/dc.20250604.1"
ACORN in a Browser
Warning
This documentation is a work in progress. Some sections may be incomplete or subject to change.