
 

ABAQUS INP COMPREHENSIVE ANALYZER

Under the Hood  —  A Deep-Dive Series

 

PART 1

Model Processing

From File Open to Fully Parsed Assembly

 

Joseph P. McFadden Sr.

McFaddenCAE.com  |  The Holistic Analyst

 

© 2026 Joseph P. McFadden Sr. All rights reserved.


 

Series Introduction — Why We Open the Box

Welcome.

 

My name is Joe McFadden. I have been doing CAE work — computer-aided engineering, which means finite element analysis — since 1979. That is not a credential I drop to impress you. It is context. It means I have spent decades inside simulation tools, and for most of that time, those tools were black boxes.

You put a model in. Numbers came out. What happened in the middle was largely hidden. You either trusted it or you did not, and there was not much in between.

That bothered me then. It still bothers me. And it is the reason this program exists, and the reason you are reading this series.

The Abaqus INP Comprehensive Analyzer is a free, open-source Python tool that reads Abaqus input files — the .inp files — and breaks them apart, piece by piece, without requiring an Abaqus license to do it. But this series is not just a tour of what buttons do what.

This is a walk through the logic. The thinking. The decisions the code makes at every step, and why it makes them.

 

I designed this series for three audiences.

First, for engineers who use the tool and want to understand what is happening under the hood — not to distrust it, but to be able to reason with it.

Second, for people who want to build their own tools and need a concrete example of how to approach a complex parsing and analysis problem from scratch.

And third, for anyone who simply refuses to accept a black box. That is the most important group to me, because that is the mindset that leads to real understanding.

 

This first part covers model processing — everything that happens from the moment you open a file to the moment the program has a complete, analyzed picture of your assembly sitting in memory. We will go step by step. We will talk about what the code reads, how it reads it, what it is looking for, and what it does when it finds it.


 

Section 1 — The File Selector and the Starting Gun

The first thing the program does when you launch it is load a graphical interface built on Python's built-in tkinter library. Tkinter is not glamorous. It is not a modern web-based framework. It is a thin wrapper around the Tk toolkit that has been part of Python for decades. But it is cross-platform, it requires no installation, and for a desktop engineering tool that needs to run on Windows machines without any special setup, it is exactly right.

The interface opens with three key controls at the top: a file path entry box, a Browse button, and a Process button.

 

Notice that the Process button starts disabled. This is a deliberate design choice. You cannot process a file you have not selected, and the code enforces that. The Browse button opens a file dialog filtered to show .inp and .cae files. When you select a file and click Open, the path populates the entry box, and only then does the Process button activate.

This is a small detail, but it illustrates a principle that runs through the entire codebase: the program tries to prevent you from doing something incorrect before you do it, rather than waiting to fail after.

 

When you click Process, the program does something that might seem overly elaborate for what appears to be a simple button click. It does not just run the analysis. It spawns a separate background thread.

Here is why. Python's tkinter interface runs on what is called the main thread — the single execution path that draws the window, responds to mouse clicks, and keeps the application alive and responsive. If you run a long computation on that same thread, the entire window freezes. You cannot move it, you cannot click Cancel, and the operating system eventually marks it as unresponsive.

So the processing logic is handed off to a daemon thread — a background worker. The main thread stays free to do one thing: watch a progress dialog, check every hundred milliseconds whether the background thread has finished, and respond to a Cancel button if you need one.

 

That progress dialog with the indeterminate progress bar — the one that bounces back and forth — is the main thread's way of telling you that work is happening. The status message inside it updates as the background thread sends signals forward through a thread-safe queue. This is a common pattern in GUI programming, and understanding it is important for anyone building tools that need to stay responsive during heavy computation.
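The thread-and-queue handoff described above can be sketched in a few lines. This is an illustrative pattern, not the tool's actual code: the names `worker` and `poll` are hypothetical, and a real tkinter application would reschedule the poll with `root.after(100, ...)` instead of joining the thread.

```python
import queue
import threading
import time

def worker(status_q):
    """Background (daemon) thread: does the heavy work and reports
    progress through a thread-safe queue."""
    for msg in ("Reading file...", "Parsing keywords...", "Done"):
        status_q.put(msg)       # queue.Queue is safe to call from any thread
        time.sleep(0.01)        # stand-in for real computation

def poll(status_q, seen):
    """Main-thread side: drain the queue without blocking.
    In tkinter this would be rescheduled with root.after(100, ...)."""
    try:
        while True:
            seen.append(status_q.get_nowait())
    except queue.Empty:
        pass

status_q = queue.Queue()
t = threading.Thread(target=worker, args=(status_q,), daemon=True)
t.start()
t.join()        # a real GUI keeps polling instead of blocking here

messages = []
poll(status_q, messages)
```

The key property is that the GUI thread never blocks: it only checks whether messages have arrived, and the worker never touches the widgets directly.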


 

Section 2 — The File Reader and the Include Recursion

The first real task, once the background thread starts, is reading the file. This sounds simple. Open the file, read the lines, done.

But Abaqus input files are not always a single file.

 

Abaqus supports a keyword called *INCLUDE. When the Abaqus preprocessor encounters this keyword, it pauses reading the current file and starts reading a separate referenced file instead — then returns and continues. This lets large models be broken into pieces: a node file here, an element file there, a materials file somewhere else.

A naive file reader would miss this entirely. It would read the keyword and move on, losing all the data in the referenced files.

The program handles this with a function called read_inp_with_includes. This function does not just read the file. It reads recursively.

 

Here is how it works. The function accepts a file path and maintains two things: an output list, and a set of visited file paths. The visited set is how it prevents infinite loops — if two files somehow reference each other, the recursion stops.

For each line in the file, the function checks whether the line matches the pattern for an *INCLUDE directive. It uses a compiled regular expression for this — a pattern that looks for the text INCLUDE, then INPUT=, then a file name.

If a match is found, the function resolves the referenced path relative to the directory of the current file, then calls itself on that new path. This is recursion — the function calling itself with a new argument.

If the referenced file is found, its lines are inserted into the output at exactly the right position. If it is not found, a warning comment is added to the output and processing continues. Resilience is intentional.

 

Every line that comes out of this function carries two pieces of information: the line content itself, and the source file it came from. This is important for debugging. If something unexpected shows up in the analysis, the source tracking tells you exactly which file that data came from — even if it was three levels of recursion deep.

The output of this function is a flat list of tuples: text and source, text and source, all the way through the model. From this point forward, the entire rest of the program works from this flat list. The file structure is gone. What remains is a clean, linear sequence of every meaningful line in the model.
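The recursive pattern described in this section can be sketched as follows. The function name matches the one in the text, but the body is a simplified illustration with an assumed regex, not the tool's real implementation; the visited-set guard here also skips legitimate repeat includes, which a production version would handle more carefully.

```python
import re
from pathlib import Path

# Assumed pattern: "*INCLUDE, INPUT=filename" (case-insensitive)
INCLUDE_RE = re.compile(r"^\s*\*INCLUDE\s*,\s*INPUT\s*=\s*(.+?)\s*$",
                        re.IGNORECASE)

def read_inp_with_includes(path, visited=None):
    """Return a flat list of (line, source_file) tuples, expanding
    *INCLUDE directives recursively."""
    visited = visited if visited is not None else set()
    path = Path(path).resolve()
    if path in visited:                 # guard against circular includes
        return []
    visited.add(path)
    try:
        lines = path.read_text().splitlines()
    except OSError:
        # Missing file: add a warning comment and keep going (resilience)
        return [(f"** WARNING: could not open {path}", str(path))]
    out = []
    for line in lines:
        m = INCLUDE_RE.match(line)
        if m:
            # Resolve relative to the directory of the *current* file,
            # then recurse into the referenced file
            child = (path.parent / m.group(1)).resolve()
            out.extend(read_inp_with_includes(child, visited))
        else:
            out.append((line, str(path)))
    return out
```

Every tuple carries its source file, so a line three levels of recursion deep still knows where it came from.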


 

Section 3 — Three Patterns That Do All the Heavy Lifting

Before we talk about parsing, we need to talk about the tools the parser uses. The entire structural intelligence of this program — its ability to distinguish a keyword from a data line, a parameter name from a parameter value — rests on three compiled regular expressions.

 

Regular expressions are pattern-matching rules for text. They are written in a compact notation that tells the computer exactly what to look for in a string. Compiled regular expressions are those same rules that have been pre-processed into an internal format that runs much faster during the millions of comparisons that happen during parsing.

Pattern One — The Include Detector

The first pattern matches *INCLUDE lines. It looks for optional whitespace, then a literal asterisk and the word INCLUDE, then a comma, the text INPUT, an equals sign, and finally the file name. The case-insensitive flag means it catches INCLUDE, Include, and include all the same.

This pattern is applied to every line during the file reading phase. It runs before anything else.

Pattern Two — The Keyword Detector

The second pattern is the most important in the entire program. It identifies Abaqus keyword lines — lines that start with an asterisk, followed by a keyword name made of letters, numbers, spaces, underscores, or hyphens.

In the Abaqus input format, keywords always start with an asterisk. This is the structural grammar of the format. A line that starts with an asterisk is a command. A line that does not start with an asterisk is data for whatever command came before it.

This single distinction — asterisk line versus data line — is the foundation of every parsing decision the program makes.

 

The keyword pattern captures everything between the asterisk and either the first comma or the end of the line. That captured group is the keyword name, which the parser immediately converts to uppercase for consistent matching regardless of how it was written in the file.

Pattern Three — The Parameter Extractor

Keywords in Abaqus are almost always followed by parameters on the same line, separated by commas. The format is: keyword, name=value, name=value, and so on.

The third pattern extracts these name-value pairs. It looks for a comma, optional whitespace, a parameter name, optional whitespace, an equals sign, optional whitespace, and then the value extending up to the next comma.

Because this pattern uses the findall method — which returns all non-overlapping matches in a string — a single call against a keyword line returns every parameter at once. The result is immediately converted into a dictionary keyed by uppercase parameter names.

 

So when the parser encounters a line like *SOLID SECTION, ELSET=BodyElements, MATERIAL=Steel6061 — the keyword pattern extracts the text SOLID SECTION, the parameter pattern extracts a dictionary with ELSET equal to BodyElements and MATERIAL equal to Steel6061, and both are available immediately for routing and storage.

That is three lines of pattern logic standing between raw text and structured, queryable data.
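Here is what the keyword and parameter patterns look like in action on the *SOLID SECTION example above. The exact regexes in the tool may differ slightly; these are illustrative reconstructions from the descriptions in this section.

```python
import re

# Keyword: asterisk followed by letters, digits, spaces, underscores, hyphens
KEYWORD_RE = re.compile(r"^\s*\*([A-Za-z0-9 _-]+)")
# Parameter: comma, name, equals sign, value up to the next comma
PARAM_RE = re.compile(r",\s*([\w ]+?)\s*=\s*([^,]+)")

line = "*SOLID SECTION, ELSET=BodyElements, MATERIAL=Steel6061"

# Captured keyword, uppercased for consistent matching
kw = KEYWORD_RE.match(line).group(1).strip().upper()

# findall returns every name=value pair at once; build a dict keyed
# by uppercase parameter names
params = {name.strip().upper(): value.strip()
          for name, value in PARAM_RE.findall(line)}
```

One `match` call and one `findall` call turn the raw line into a keyword string and a parameter dictionary, ready for routing.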


 

Section 4 — The First Pass: Building the Model Summary

With the file fully read and the parsing tools ready, the first major function to run is called parse_inp_summary. This is the high-level reconnaissance pass. It reads the entire file once and builds a broad statistical picture of the model.

 

Think of it like doing a census before you start studying a population. You are not yet trying to understand relationships or calculate anything. You are counting and categorizing — taking inventory.

The function initializes a large dictionary before it even looks at the first line. This dictionary is the container for everything the summary pass will collect. Let me walk you through what is in it.

There is a count of nodes, a count of elements, and a counter — a specialized dictionary — that tracks how many elements of each type exist in the model. This is where you learn that your model has, say, forty thousand C3D8R elements and two thousand C3D10M elements.

There is a set of material names, a list of section definitions, counts of element sets and node sets, counts of surface definitions, tie constraints, contact pairs.

There is a step count and a detailed list of step definitions, including what type of analysis each step performs, what time period it runs to, and what incrementation parameters it uses.

There are lists for amplitude definitions, gravity and distributed load records, initial conditions, predefined fields, mass scaling settings, bulk viscosity settings, and hourglass control configurations.

There is a flag for whether General Contact is active. There is a list of assembly instances and their transformations.

The State Machine Concept

All of this comes from a single linear pass through the file. The logic is a state machine.

A state machine is a system that can be in one of several states at any time, and that transitions from one state to another based on inputs it receives.

In this parser, the state is tracked by three variables: the current keyword, the current material, and the current element type. There is also a boolean — a true or false flag — that tracks whether the parser is currently inside a step definition.

 

Here is the fundamental logic. For each line in the file:

First, skip it if it is blank or starts with two asterisks. Double-asterisk lines are Abaqus comments. They carry no structural meaning.

Second, try to match the line against the keyword pattern. If it matches, update the state — set the current keyword, extract parameters, and dispatch to the appropriate logic for that keyword.

Third, if the line does not match the keyword pattern, it is a data line. Handle it based on whatever state is currently active.

 

Here is what happens for each major keyword.

When the parser sees *NODE, it sets the current keyword to NODE. Every subsequent data line that starts with a digit is counted as one node.

When it sees *ELEMENT with a TYPE parameter, it sets the current element type. Every subsequent data line is counted as one element, and the element type counter for that type increments.

When it sees *MATERIAL with a NAME parameter, it adds that name to the materials set. Material names are unique — sets do not allow duplicates — so the same material referenced multiple times only appears once.

When it sees *STEP, it initializes a new step object, stores the step name and step number, checks whether the perturbation flag is present, and sets the in-step flag to true. When it later sees *END STEP, it finalizes that step object, pushes it to the step list, and clears the flag.

Inside a step, procedure keywords like *STATIC, *DYNAMIC (explicit or implicit), *FREQUENCY, *BUCKLE, and *STEADY STATE DYNAMICS each set the step type and add themselves to the global list of simulation types in use. The time period and incrementation parameters come from the first data line following the procedure keyword.
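The state-machine logic described above can be condensed into a short sketch. This is a stripped-down illustration covering only nodes, elements, materials, and step counting; the real `parse_inp_summary` tracks far more.

```python
import re
from collections import Counter

KEYWORD_RE = re.compile(r"^\s*\*([A-Za-z0-9 _-]+)")
PARAM_RE = re.compile(r",\s*([\w ]+?)\s*=\s*([^,]+)")

def parse_inp_summary(lines):
    """Single linear pass; state is the current keyword and element type."""
    summary = {"nodes": 0, "elements": 0, "element_types": Counter(),
               "materials": set(), "steps": 0}
    keyword = None          # current state
    elem_type = None
    for line in lines:
        if not line.strip() or line.lstrip().startswith("**"):
            continue                        # blanks and comments: skip
        m = KEYWORD_RE.match(line)
        if m:                               # keyword line: transition state
            keyword = m.group(1).strip().upper()
            params = {k.strip().upper(): v.strip()
                      for k, v in PARAM_RE.findall(line)}
            if keyword == "ELEMENT":
                elem_type = params.get("TYPE", "UNKNOWN")
            elif keyword == "MATERIAL":
                summary["materials"].add(params.get("NAME", ""))
            elif keyword == "STEP":
                summary["steps"] += 1
        else:                               # data line: act on current state
            if keyword == "NODE" and line.strip()[0].isdigit():
                summary["nodes"] += 1
            elif keyword == "ELEMENT":
                summary["elements"] += 1
                summary["element_types"][elem_type] += 1
    return summary
```

Note how a data line means nothing on its own; its interpretation depends entirely on which keyword is currently active.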

 

When the summary pass is complete, the program has a complete census. It knows how large the model is, what kind of simulation it runs, how it is connected, and what physics it attempts to solve. That information drives everything that comes next.


 

Section 5 — The Second Pass: Material DNA and Section Mapping

The first pass gave us a count. The second pass — handled by the function parse_material_section_part_and_props — gives us relationships.

This pass is where the program extracts what I call the material DNA of the model. It maps which materials are assigned to which sections, which sections belong to which parts, and what the actual numerical property values are for every material.

 

This function runs another linear pass through the same flat file list. It maintains a more complex set of state variables.

There is a part stack — a list that grows and shrinks as the parser moves in and out of *PART and *END PART blocks. The stack structure is important: it handles nested context. When you are inside a part, the stack has that part's name at its top. When you exit the part, the name pops off.

There is a current material variable and a current property variable. There are section and element-set ownership dictionaries.

Tracking Part Ownership

When the parser sees *PART with a NAME parameter, it pushes that name onto the part stack. Every section and element set encountered while the stack is non-empty is tagged as belonging to that part. When *END PART appears, the name pops off.

This lets the program track which parts own which element sets without any explicit labeling — the structure of the file itself carries that information through scope.

Section Records

When the parser sees *SOLID SECTION, *SHELL SECTION, or *MEMBRANE SECTION, it creates a section record. That record captures the section type, the raw keyword line, the current part from the stack, and extracts the ELSET and MATERIAL parameters.

For shell sections and membrane sections, the thickness is not on the keyword line. It is on the first data line that follows. So when the parser is tracking a shell or membrane section and the current section record has no thickness yet, the very next data line is treated as a thickness value and parsed accordingly. Once the thickness is captured, the section record is considered complete.

Material Property Extraction

When the parser sees *MATERIAL with a NAME parameter, it sets the current material. Every keyword that follows — ELASTIC, DENSITY, EXPANSION, CONDUCTIVITY, SPECIFIC HEAT, PLASTIC, DAMPING — is recognized as a material property keyword. Each one creates a new property record attached to the current material.

The data lines that follow are parsed as rows of floating-point numbers. Each number is converted from its text form using Python's float function. If conversion succeeds for every value on a line, the row is added to the property data. If any value fails to convert — perhaps it is a flag or a string token — the row is discarded. The parser is conservative: only clean numerical data is kept.
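The conservative filtering rule can be sketched like this. The function name is hypothetical; only the keep-clean-rows behavior is taken from the description above.

```python
def parse_property_rows(data_lines):
    """Keep only rows where every comma-separated token converts cleanly
    to a float; discard any row containing a non-numeric token."""
    rows = []
    for line in data_lines:
        tokens = [t.strip() for t in line.split(",") if t.strip()]
        try:
            rows.append([float(t) for t in tokens])
        except ValueError:
            continue        # flag or string token: drop the whole row
    return rows
```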

 

This is how you end up with a material that has, for example, an ELASTIC property with twelve rows of data — Young's modulus and Poisson's ratio at twelve different temperatures. That is temperature-dependent elastic data, and the program captures every row.

Building the Material-Section-Part Map

At the end of the second pass, the program does a final assembly step. It loops through all the section records it collected and builds a nested map: for each material, which sections reference it, and for each section, which parts are associated with it.

The association comes from two sources. First, if the section record carries a part name from the stack, that part is directly associated. Second, if the section's ELSET parameter matches a key in the element-set ownership dictionary, the owning parts from that dictionary are also added.

The result is the material-section mapping — a dictionary where each material name points to a list of section entries, and each entry includes the section type, the element set, the associated parts, and the thickness if applicable.

 

This mapping is the engine behind the Materials tab, the Sections tab, and the Parts tab. It is also the input to part identification, which is the next major step.


 

Section 6 — Node Coordinates and the Geometry Foundation

While the first two passes built the structural and material picture, the geometry — the actual spatial coordinates of every point in the mesh — is captured in a dedicated third pass called parse_node_coordinates.

 

A finite element model is fundamentally a set of points in space connected by elements. The points are nodes. Each node has an identifier and three coordinates: X, Y, and Z.

In a simple model, node IDs are unique across the entire file. But in a structured model — one where multiple parts are defined with *PART blocks — each part defines its own nodes with its own numbering. Two different parts can both have a node number one, and those are completely different points in space.

 

This is the node ID collision problem, and it is one of the most important challenges in multi-part model processing.

The program addresses it by maintaining two parallel storage structures. One is a flat dictionary keyed by integer node ID for simple models. The other is a dictionary keyed by part-node tuples — a pair containing the part name and the node ID — for structured models.

As the parser moves through the file, it tracks the current part context using the same *PART and *END PART pattern as before. When a node is encountered inside a part block, it is stored under the tuple key. When a node is encountered outside any part block, it is stored under the integer key.

 

The node coordinate record for a given node looks like this: a three-element tuple of floating-point numbers representing X, Y, and Z. The line itself is parsed by splitting on commas, taking the second through fourth values — skipping the first, which is the node ID — and converting each to a float.
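The dual-keyed storage can be illustrated with a small sketch. The `store_node` helper is a hypothetical name introduced here to show the idea; it is not a function from the tool.

```python
def store_node(line, part, flat, scoped):
    """Store one node line under an integer key (simple models) or a
    (part, id) tuple key (structured models)."""
    tokens = [t.strip() for t in line.split(",")]
    nid = int(tokens[0])
    # Skip the ID, take X, Y, Z as floats
    xyz = tuple(float(t) for t in tokens[1:4])
    if part:
        scoped[(part, nid)] = xyz   # part-scoped numbering
    else:
        flat[nid] = xyz             # global numbering

flat, scoped = {}, {}
store_node("1, 0.0, 1.5, 2.0", None, flat, scoped)
store_node("1, 9.0, 9.0, 9.0", "Bracket", flat, scoped)  # same ID, no collision
```

Two nodes both numbered 1 coexist because they live under different keys, which is exactly how the ID collision problem is sidestepped.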

The program also handles a special Abaqus feature: the *NODE GENERATE keyword, which defines a range of nodes with regular spacing. This is less common but the parser recognizes and handles it.

When node parsing is complete, every spatial point in the model is in memory, correctly keyed so that later computation can find any node's coordinates regardless of whether the model uses simple numbering or part-scoped numbering.


 

Section 7 — Element Connectivity: How Nodes Become Structure

Nodes are points. Elements are the connections between them. The element connectivity pass — parse_element_connectivity — reads the *ELEMENT blocks and builds the complete topology of the mesh.

 

Each element has three pieces of information: an ID, a type, and a list of node IDs that form its corners. The type is critical because it determines how many nodes there are and how they are arranged.

A C3D8R is an eight-noded brick element — a cube-like solid. A C3D4 is a four-noded tetrahedron. A C3D10M is a ten-noded quadratic tetrahedron — four corner nodes plus six midside nodes for curved-edge capability. A C3D20R is a twenty-noded quadratic brick. S4R is a four-noded shell. M3D4R is a four-noded membrane.

 

The element type determines something critical for parsing: the expected number of node entries per element. A C3D8R has eight nodes. If those eight IDs fit on one line, the element definition is one line. If they do not — which is common for quadratic elements with ten or twenty nodes — the definition continues onto the next line.

Handling Multi-Line Element Definitions

This is one of the places where the parser has to be smart about what constitutes a complete record. A naive line-by-line parser would see the element start on one line, see what looks like a new record on the next, and corrupt the connectivity table.

The program handles this by knowing the expected node count for each element type. When it starts reading an element, it collects node IDs until it has the expected count. If the current line does not provide enough IDs, the parser looks to the next line for continuation — as long as that next line does not start with an asterisk, which would indicate a new keyword.

This continuation logic was the subject of a bug fix in version 15.7. Large models with multi-line element definitions — particularly C3D10M and C3D20R meshes — were previously causing a list index out of range error during STL export because some elements were being stored with incomplete node lists. The fix added defensive validation: any element that does not have the expected number of nodes after parsing is flagged and excluded from geometry operations, rather than causing the entire export to fail.
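The continuation-plus-validation logic can be sketched as follows. The node-count table is a small subset, and `read_element` is an illustrative name; the defensive check at the end mirrors the v15.7 fix described above.

```python
# Expected node counts per element type (subset; the real tool's table
# is much fuller)
NODES_PER_TYPE = {"C3D4": 4, "C3D8R": 8, "C3D10M": 10, "C3D20R": 20, "S4R": 4}

def read_element(lines, i, elem_type):
    """Collect node IDs across continuation lines until the expected
    count is reached. Returns (element_or_None, next_line_index)."""
    expected = NODES_PER_TYPE[elem_type]
    tokens = [t for t in lines[i].split(",") if t.strip()]
    eid, nodes = int(tokens[0]), [int(t) for t in tokens[1:]]
    i += 1
    # Continue onto following lines as long as more nodes are needed
    # and the next line is not a new keyword (asterisk)
    while (len(nodes) < expected and i < len(lines)
           and not lines[i].lstrip().startswith("*")):
        nodes += [int(t) for t in lines[i].split(",") if t.strip()]
        i += 1
    if len(nodes) != expected:      # defensive validation (the v15.7 fix)
        return None, i              # incomplete element: exclude it
    return {"id": eid, "type": elem_type, "nodes": nodes}, i
```

A C3D10M element split across two lines comes back as one complete ten-node record, and a truncated element is excluded rather than poisoning later geometry operations.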

Part Context and ELSET Tagging

Just as with nodes, element parsing tracks part context. Elements encountered inside a *PART block are tagged with the part name. This part tag is stored on every element record.

Elements can also carry an element-set tag — the ELSET parameter on the *ELEMENT keyword line. This becomes the element's membership identifier in the set-based grouping system.

Each element record that comes out of this function is a dictionary with four fields: the element ID, the element type string, the list of node IDs, and optionally the part name and element set name.

 

When element parsing is complete, the program has the full mesh topology in memory: every node's location in space, and every element's list of the nodes it connects. This is sufficient to calculate volumes, tessellate surfaces, perform penetration checks, and export geometry.


 

Section 8 — Part Identification: The Three-Tier Logic

We have nodes. We have elements. We have materials. We have sections. Now comes the question that sits at the heart of multi-part model analysis: which elements belong to which part?

 

This question sounds simple, but the answer depends entirely on how the model was built. Abaqus supports multiple modeling workflows, and those workflows produce input files with very different structures. The program handles three distinct cases.

Tier One — Structured Models with Explicit Part Definitions

The cleanest case. The input file contains explicit *PART blocks, each with a NAME parameter. Every element inside a part block carries a part tag that was applied during the connectivity pass. Grouping is direct: collect all elements with the same part tag, and you have the part.

The function identify_parts_from_inp_parts handles this case. It groups elements by their part tag, looks up section and material information from the mapping, and builds a part record that includes the element list, the material name, the section type, the thickness if applicable, and a flag marking this as a true structured-part record.

Tier Two — Orphan Meshes with Material-Section Grouping

An orphan mesh is a mesh that was exported from a CAD or meshing tool without preserving the original part structure. The *PART blocks are gone. Node and element IDs may be global and non-overlapping, but the concept of a part has been lost.

In this case, the program reverse-engineers parts from the material-section structure. The logic is: any group of elements that share the same material and the same section definition probably constitutes a distinct physical component.

The function identify_parts_by_material_section loops through the section records, builds a mapping from element-set names to section and material information, then groups elements by element set. Groups that share the same material-section combination are merged into a single part. The resulting parts are named sequentially: Part-1, Part-2, Part-3, and so on.

 

This is not perfect. Two physically separate components with identical materials and section types will be grouped as one part. But in practice, most models have enough material or section variation to produce useful separations, and the program makes its methodology transparent so you can evaluate whether the result makes sense.

Tier Three — ELSET-Based Part Naming

Some exported models — particularly those from certain meshing workflows — carry part names embedded in their element-set names using a specific convention: the set name contains the part name followed by an at-sign and an instance identifier.

The program detects this pattern and uses it to apply meaningful part names to what would otherwise be anonymous numbered parts. This is the ELSET-based naming path, and it can handle assemblies with well over a hundred named parts. The part naming validation dialog — which appears after processing — shows you exactly which names came from ELSET declarations, which came from reverse engineering, and whether there are any discrepancies.
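A minimal sketch of that name extraction, assuming the "PartName@instance" convention described above (the tool's actual pattern handling may be richer):

```python
import re

# Assumed convention: everything before the at-sign is the part name,
# everything after is an instance identifier
ELSET_NAME_RE = re.compile(r"^(.+?)@(.+)$")

def part_name_from_elset(elset):
    """Return the embedded part name, or None if the set name does not
    follow the at-sign convention."""
    m = ELSET_NAME_RE.match(elset)
    return m.group(1) if m else None
```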

The Dispatcher

The top-level function identify_parts makes the decision between tiers one and two by checking a single flag on the elements: if any element in the model carries a non-empty part tag, the model is structured and tier one applies. If none do, tier two applies.

Tier three operates on top of whichever identification method ran first, applying better names where it can find them.
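The tier-one/tier-two dispatch and grouping can be condensed into one sketch. The function signature and the element-record fields here are illustrative stand-ins for the real `identify_parts`, `identify_parts_from_inp_parts`, and `identify_parts_by_material_section`.

```python
from collections import defaultdict

def identify_parts(elements, elset_to_mat_sec=None):
    """Tier one if any element carries a part tag; otherwise tier two,
    grouping by material/section combination."""
    groups = defaultdict(list)
    if any(e.get("part") for e in elements):            # tier one
        for e in elements:
            groups[e.get("part", "UNNAMED")].append(e["id"])
        return dict(groups)
    # Tier two: merge elements whose element set maps to the same
    # (material, section) pair, then name groups sequentially
    elset_to_mat_sec = elset_to_mat_sec or {}
    for e in elements:
        key = elset_to_mat_sec.get(e.get("elset"), ("?", "?"))
        groups[key].append(e["id"])
    return {f"Part-{n}": ids
            for n, ids in enumerate(groups.values(), start=1)}
```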

 

There is an important subtlety here that any developer building a similar tool needs to understand. In structured models, you cannot filter elements by ID alone. Two parts can both have an element with ID one hundred. If you search by ID without first filtering by part name, you get the wrong element. This is the element ID collision problem, parallel to the node ID collision described earlier. The program solves it by always looking up elements through the part context first.


 

Section 9 — Model Type Detection and the Orphan Mesh Dialogue

Part identification tells us which elements belong to which parts. But the program also runs a separate, lighter-weight classification pass called detect_model_type that categorizes the overall model structure.

 

This function scans the raw file lines — not the parsed data, but the original text — and counts three things: how many *PART blocks there are, how many section definitions there are, and how many material definitions there are.

From those three numbers, it makes a classification decision.

If there are zero *PART blocks, the model is classified as an orphan mesh. The confidence is medium, because it is possible to have a valid single-part model with no explicit PART block.

If there is exactly one *PART block and more than one section or material, the model is classified as an orphan mesh with high confidence — the single PART block was likely added by an export tool and does not represent genuine structured assembly.

If there is exactly one *PART block with one section and one material, the model is classified as simple — a single-component model.

If there are two or more *PART blocks, the model is classified as structured with high confidence.

 

This classification drives the user experience after processing.

For structured models, the program parses the original part list from the file and offers a dual-view switcher in the Parts tab: you can see the parts as they were declared in the file, or you can see the parts as identified by material and section analysis. This comparison is often informative — it reveals whether the ELSET-based naming and the file-based naming agree.

For orphan meshes, the program shows a confirmation dialog before proceeding. This dialog explains what an orphan mesh is, notes that part names may be generic rather than meaningful, and asks you to confirm that you want to continue. This is the anti-black-box philosophy in action: rather than silently making assumptions, the program surfaces them.

A model banner at the top of the Parts tab then shows the detection result so it is always visible as you work.


 

Section 10 — Unit System Detection: The Detective Work

Abaqus does not enforce units. You can put any number you want in any field. The solver has no idea whether your modulus of two hundred thousand means two hundred thousand pascals or two hundred thousand megapascals. It just uses the number. The responsibility for consistency is entirely on the analyst.

 

This is one of the most dangerous aspects of finite element modeling, and one of the most common sources of invisible errors. A model with internally consistent units but the wrong unit system will produce results that are off by orders of magnitude — and if you do not know what answer to expect, you may not notice.

The program addresses this with a unit detection system built from two independent approaches: material library matching and value range analysis.

Material Library Matching

The program maintains an internal library of common engineering materials — steel, aluminum, titanium, FR4 circuit board material, SAC305 solder, various plastics, magnesium, brass, and more — with their property values tabulated for each of ten unit systems.

Ten unit systems. Let me name them so they are concrete. There are the SI systems: SI in meters, SI in millimeters, and SI in millimeters with tonnes for mass rather than kilograms. There are the time-based systems using milliseconds — grams, millimeters, and milliseconds; and tonnes, millimeters, and milliseconds. There are the imperial systems: inches, pounds-force, and seconds; and feet, pounds-force, and seconds. There are also mixed systems.

 

The detection function extracts Young's modulus and density from each material in the model. It then compares those values against the library entries for every material across all ten unit systems. The comparison is not a binary match — it is a tolerance-based proximity score. If your steel's modulus matches the steel library entry for the millimeter-tonne-second system within a percentage threshold, that is evidence for that unit system.

Evidence accumulates across all materials. The system with the highest total evidence score wins, and the result is presented as a confidence percentage.

There is a threshold check: if the best match is under twenty percent confidence, the library-based result is not reliable and the program falls back to the range-based approach.
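To make the scoring concrete, here is a minimal sketch of tolerance-based proximity scoring. Everything in it is illustrative: the library values, the function names, and the tolerance are assumptions for this example, not the program's actual code.

```python
# Illustrative sketch of library-based unit detection. The library values,
# names, and tolerance are assumptions for this example, not the tool's code.

LIBRARY = {
    "SI (m, kg, s)": {"steel": {"E": 200e9, "rho": 7800.0}},
    "mm-tonne-s":    {"steel": {"E": 200e3, "rho": 7.8e-9}},
    "g-mm-ms":       {"steel": {"E": 200e3, "rho": 7.8e-3}},
}

def proximity(value, reference, tol=0.15):
    """Score 1.0 for an exact match, tapering linearly to 0 at the tolerance."""
    if value is None or reference == 0:
        return 0.0
    rel_err = abs(value - reference) / abs(reference)
    return max(0.0, 1.0 - rel_err / tol)

def score_unit_systems(model_materials):
    """Accumulate proximity evidence for every unit system across all materials."""
    scores = {system: 0.0 for system in LIBRARY}
    for props in model_materials.values():
        for system, entries in LIBRARY.items():
            for ref in entries.values():
                scores[system] += proximity(props.get("E"), ref["E"])
                scores[system] += proximity(props.get("rho"), ref["rho"])
    # Rank highest-evidence system first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# A slightly-off steel in mm-tonne-s units still wins on proximity.
ranked = score_unit_systems({"my_steel": {"E": 205e3, "rho": 7.85e-9}})
```

Note how a five percent error in modulus still scores strongly, while a value in the wrong unit system scores zero: the tolerance band is what turns raw numbers into evidence.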

The Ambiguity Flag

A critical feature added from external AI review — specifically Grok and Perplexity analysis — is the relative gap check between the top two unit systems. If the top scorer and the second scorer are very close to each other, the program flags the result as ambiguous rather than presenting a false confident answer.

The most important ambiguity case to understand is grams-millimeters-milliseconds versus tonnes-millimeters-seconds. Both unit systems happen to use the same Young's modulus values for common materials. The difference that distinguishes them is density — which differs by a factor of one million between the two systems. A model with only modulus data and no density data is genuinely undetectable from the library alone. The program surfaces this ambiguity explicitly rather than guessing.
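The gap check itself is simple to sketch. A hypothetical flag_ambiguity helper (the name and the ten percent threshold are assumptions for this example) compares the top two scores:

```python
def flag_ambiguity(ranked_scores, gap_threshold=0.10):
    """Return True when the top two unit-system scores are too close to call.

    ranked_scores: list of (system_name, score) pairs, sorted best-first.
    gap_threshold: assumed relative-gap cutoff for this sketch.
    """
    if len(ranked_scores) < 2:
        return False
    best, second = ranked_scores[0][1], ranked_scores[1][1]
    if best <= 0:
        return True          # no evidence at all is maximally ambiguous
    return (best - second) / best < gap_threshold
```

With scores of 10.0 and 9.8 the relative gap is two percent, well under the threshold, so the result is flagged rather than presented as confident.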

The User Confirmation Dialog

Whatever the detection result, the program does not simply apply a unit system silently. It presents a dialog showing the detected system, the confidence level, which materials matched and to what, and a list of all ten unit systems to choose from.

You confirm the result or override it. If you skip the step entirely, the program records that no unit system was selected and proceeds without unit-aware labeling — and it warns you that this is not recommended.

The selected unit system then drives how results are displayed throughout the interface: mass in kilograms or grams or pounds-mass, volume in cubic millimeters or cubic inches, and so on.


 

Section 11 — Volume and Mass: Gaussian Quadrature Under the Hood

With parts identified, nodes located, elements connected, and a unit system selected, the program can calculate volumes and masses. This is more involved than it might sound.

 

For a regular geometric shape — a cube, a cylinder, a sphere — volume has a closed-form formula. But a finite element mesh is an irregular collection of polyhedral cells, each with four or more nodes at potentially arbitrary positions in space. There is no single closed-form formula for the volume of an arbitrary hexahedral or tetrahedral element.

The method the program uses is Gaussian quadrature. This is a numerical integration technique that evaluates a function at a set of specific sample points — called Gauss points — and combines those evaluations with weights to approximate the integral over the element domain.

 

For volume, the integrand is simply the Jacobian determinant — a mathematical quantity that represents how local space within the element is scaled relative to a standard reference element. The reference element is a perfect unit cube or unit tetrahedron in a fictitious coordinate system. The real element is the distorted version of that reference in physical space. The Jacobian tells you how much volume a tiny unit of reference space corresponds to in physical space.

Integrate the Jacobian over the reference element using Gauss quadrature, and you get the physical volume of the element.

 

For an eight-noded brick — the C3D8 family — the reference domain uses two Gauss points in each of three directions, giving eight sample points total. For a four-noded tetrahedron, four Gauss points are used. For quadratic elements with midside nodes, more points and higher-order shape functions are required.
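As a concrete illustration of the eight-point rule for the C3D8 family, here is a self-contained sketch (not the program's own function) that integrates the Jacobian determinant over one element:

```python
import numpy as np

# Standard C3D8 node ordering in natural coordinates (xi, eta, zeta).
NAT = np.array([[-1,-1,-1],[ 1,-1,-1],[ 1, 1,-1],[-1, 1,-1],
                [-1,-1, 1],[ 1,-1, 1],[ 1, 1, 1],[-1, 1, 1]], dtype=float)

def hex8_volume(coords):
    """Volume of an 8-node hexahedron by 2x2x2 Gauss quadrature.

    coords: (8, 3) array of node positions in the element's standard order.
    The integrand is the Jacobian determinant; all eight Gauss weights are 1.
    """
    g = 1.0 / np.sqrt(3.0)
    volume = 0.0
    for gp in NAT * g:                      # the 8 Gauss points sit at +/-g
        # dN/d(xi,eta,zeta) for each of the 8 trilinear shape functions
        dN = np.empty((8, 3))
        for i, (xi_i, eta_i, zeta_i) in enumerate(NAT):
            dN[i, 0] = 0.125 * xi_i   * (1 + eta_i*gp[1]) * (1 + zeta_i*gp[2])
            dN[i, 1] = 0.125 * eta_i  * (1 + xi_i*gp[0])  * (1 + zeta_i*gp[2])
            dN[i, 2] = 0.125 * zeta_i * (1 + xi_i*gp[0])  * (1 + eta_i*gp[1])
        J = dN.T @ coords                   # 3x3 Jacobian at this Gauss point
        volume += abs(np.linalg.det(J))     # weight = 1 for the 2-point rule
    return volume

# Sanity case: a 2 x 3 x 4 box should integrate to exactly 24.
box = NAT * 0.5 * np.array([2.0, 3.0, 4.0]) + np.array([1.0, 1.5, 2.0])
```

For this undistorted box the Jacobian is constant and the quadrature is exact; for distorted elements the rule remains exact up to the polynomial order it was designed for.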

The program implements vectorized calculations — using NumPy array operations rather than Python loops — to evaluate the Jacobian across all elements of a given type simultaneously. This is the optimization credited in the code comments to the Gemini advisory review, and it produces roughly one hundred to one thousand times speedup on large models compared to element-by-element Python loops.
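The vectorization idea can be sketched as follows: stack every element's node coordinates into a single (n, 8, 3) array and let NumPy evaluate all n Jacobians at each Gauss point at once. This is a simplified stand-in for the program's implementation, not the actual code:

```python
import numpy as np

# C3D8 natural coordinates, one row per node.
NAT = np.array([[-1,-1,-1],[ 1,-1,-1],[ 1, 1,-1],[-1, 1,-1],
                [-1,-1, 1],[ 1,-1, 1],[ 1, 1, 1],[-1, 1, 1]], dtype=float)

def hex8_volumes_vectorized(all_coords):
    """Volumes of n hexahedra at once. all_coords has shape (n, 8, 3)."""
    g = 1.0 / np.sqrt(3.0)
    vols = np.zeros(all_coords.shape[0])
    for gp in NAT * g:                       # the eight Gauss points
        # Shape-function derivatives at this point, shared by all elements.
        dN = np.empty((8, 3))
        for i, (a, b, c) in enumerate(NAT):
            dN[i, 0] = 0.125 * a * (1 + b*gp[1]) * (1 + c*gp[2])
            dN[i, 1] = 0.125 * b * (1 + a*gp[0]) * (1 + c*gp[2])
            dN[i, 2] = 0.125 * c * (1 + a*gp[0]) * (1 + b*gp[1])
        # One einsum builds all n 3x3 Jacobians for this Gauss point.
        J = np.einsum('ak,nab->nkb', dN, all_coords)
        vols += np.abs(np.linalg.det(J))     # batched determinant, weight = 1
    return vols

# Two cubes: side 1 and side 2.
cubes = np.stack([NAT * 0.5, NAT * 1.0])
```

The Python-level loop runs only over the eight Gauss points; the per-element work happens inside einsum and the batched determinant, which is where the hundred-fold-plus speedup comes from.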

Mass Calculation and the Density Conversion

Mass is volume multiplied by density. But there is a trap here.

Density in Abaqus models can be expressed in different ways depending on the unit system. In the grams-millimeters-milliseconds system, a typical steel density is approximately 7.8 × 10⁻³ grams per cubic millimeter. In the tonnes-millimeters-seconds system, the same steel has a density of approximately 7.8 × 10⁻⁹ tonnes per cubic millimeter.

Those two numbers differ by a factor of one million. The program applies a density conversion check: if the raw density value is greater than 10⁻⁵, it is assumed to be in grams per cubic millimeter and is converted to tonnes per cubic millimeter by multiplying by 10⁻⁶. If it is already below that threshold, it is assumed to be in the correct form.

This is a heuristic — an educated assumption based on typical material property ranges. It will be wrong for exotic materials with unusually high or low densities. Understanding this heuristic, rather than assuming the conversion is universal, is part of why this series exists.
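The heuristic reduces to a one-line threshold test. A sketch, with an assumed function name:

```python
def normalize_density(rho):
    """Sketch of the threshold heuristic described above (assumed name).

    Raw densities above 1e-5 are taken to be in g/mm^3 and scaled by 1e-6
    into t/mm^3; smaller values are assumed already consistent. This is
    deliberately a heuristic and can misfire on exotic materials.
    """
    if rho > 1e-5:
        return rho * 1e-6    # g/mm^3 -> t/mm^3
    return rho
```

Steel at 7.8e-3 g/mm³ trips the threshold and becomes 7.8e-9 t/mm³; a value already expressed in tonnes passes through untouched.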

 

There is also a filtering step before volume calculation. Not all elements in a model are geometric. MASS elements, SPRING elements, and DASHPOT elements are point or connector entities with no volume. The program identifies and skips these before summing volumes, rather than letting them corrupt the result.

Volume results were validated against independent tools such as SolidWorks; that benchmark comparison was the ground truth that confirmed the correctness of the calculation.


 

Section 12 — Interface Nodes, Part Relationships, and Interaction Detection

Once volumes and masses are calculated, the program moves into relationship analysis. It asks: which parts are physically connected to which other parts, and how?

Interface Nodes

The first step is finding interface nodes — nodes that are shared between two or more parts. In a well-meshed assembly, two parts that are bonded together at a surface share the same node IDs at that surface. If part A and part B both list node 500 in their element connectivity, that node sits on the interface between them.

The function find_interface_nodes builds a dictionary from node ID to the list of parts that use that node. Any node appearing in more than one part's list is an interface node. The count of shared nodes between any two parts is a rough proxy for their contact area, rough because the count also scales with mesh density.
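The dictionary-building idea can be sketched in a few lines. The signature here is a simplification; the real function's inputs may differ:

```python
from collections import defaultdict

def find_interface_nodes(part_elements):
    """Sketch of the shared-node search (the real signature may differ).

    part_elements: {part_name: [(elem_id, [node_ids, ...]), ...]}
    Returns {node_id: sorted list of part names}, keeping only nodes that
    appear in two or more parts.
    """
    usage = defaultdict(set)
    for part, elems in part_elements.items():
        for _elem_id, nodes in elems:
            for node in nodes:
                usage[node].add(part)
    return {n: sorted(parts) for n, parts in usage.items() if len(parts) > 1}
```

In the node-500 example from above, the node appears in both parts' connectivity and therefore survives the two-or-more filter.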

Interaction Detection

Physical connections also show up explicitly in the input file as Abaqus interaction keywords: *TIE, *CONTACT PAIR, *CONTACT (general contact), *COUPLING, and *MPC. The parse_interactions function scans for these and extracts the names of the surfaces or sets involved, which can then be traced back to parts.

This is more complex than node sharing because surface names and set names are indirect references. The program builds as complete a picture as the naming conventions in the file allow.

The Relationship Graph

The function build_part_relationships combines node sharing and explicit interactions into a relationship graph. For each pair of parts, it records whether they share interface nodes, how many nodes they share, whether they appear together in any interaction definition, and whether they likely interact mechanically based on both signals together.

This graph feeds the nearest-parts search, the penetration check, and the part interaction visualization in the interface. It is what allows the user to select a part and immediately see which other parts are in contact with it.

 

An important performance note: the penetration check — which detects geometric overlap between parts — was upgraded from a brute-force comparison to a K-D tree algorithm. A K-D tree is a spatial data structure that partitions points in three-dimensional space and allows nearest-neighbor queries in logarithmic time rather than linear time. The difference between an O(n×m) brute-force comparison and an O(n log m) set of tree queries is what makes the penetration check practical on large assemblies.
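Here is a minimal sketch of the tree-based query using SciPy's cKDTree. The function name and tolerance parameter are illustrative assumptions, not the program's actual interface:

```python
import numpy as np
from scipy.spatial import cKDTree

def nodes_near_other_part(nodes_a, nodes_b, tolerance):
    """For each node of part A, test whether any node of part B lies within
    `tolerance`. The tree answers each nearest-neighbor query in roughly
    logarithmic time, versus scanning all of B for every A node.
    Illustrative sketch only; names and signature are assumptions.
    """
    tree = cKDTree(nodes_b)                  # build once per part pair
    distances, _indices = tree.query(nodes_a, k=1)
    return np.asarray(distances) <= tolerance
```

Building the tree costs O(m log m) once; every subsequent proximity question is cheap, which is the whole point of the upgrade.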


 

Section 13 — Part Name Validation and the ELSET Cross-Check

Version 15.5 introduced a validation step that compares two independent sources of part name information: the declared names from ELSET-based naming in the input file, and the detected names produced by the reverse-engineering analysis.

 

The function validate_part_names builds two sets: the declared set, which comes from parsing element set names for the part-name@ID convention, and the detected set, which comes from the identify_parts logic. It then computes the intersection — parts found by both methods — and the two differences: parts declared but not detected, and parts detected but not declared.

If both differences are empty, meaning every part name found by one method was also found by the other, validation passes. If there are discrepancies, the program reports them and presents a choice.
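The set arithmetic behind the cross-check is compact. A simplified sketch, with an assumed return shape:

```python
def validate_part_names(declared, detected):
    """Sketch of the cross-check (simplified, assumed return shape).

    declared: names parsed from the ELSET part-name@ID convention.
    detected: names produced by the reverse-engineering identification.
    """
    declared, detected = set(declared), set(detected)
    return {
        "matched":       declared & detected,   # found by both methods
        "declared_only": declared - detected,   # declared but not detected
        "detected_only": detected - declared,   # detected but not declared
        "passes":        declared == detected,
    }
```

A mismatch in either direction fails the check and carries the exact discrepancy list into the dialog.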

The validation dialog lets you choose whether to use the declared names, the detected names, or cancel loading entirely. This is not a decision the program makes for you. You see the data, you evaluate the discrepancies, and you choose.

 

This cross-check is particularly valuable for large assemblies with many parts. An assembly with 176 parts — which is a real case this program has been tested on — has a lot of opportunities for naming inconsistencies. Surfacing them before analysis rather than allowing them to corrupt downstream results is exactly the kind of quality gate that separates a professional tool from a script that assumes everything is fine.


 

Section 14 — Simulation Intent Classification

The final step of model processing — introduced in version 15.7 — is automatic simulation intent classification. After the full parse is complete and all parts are identified, the program scores the model against seventeen engineering simulation purposes.

 

The seventeen categories include: drop test, crash and impact, quasi-static structural, modal and frequency analysis, thermal analysis, thermomechanical coupling, buckling, fatigue, creep and relaxation, hyperelastic behavior, composites, connector and joint analysis, acoustic analysis, contact mechanics, manufacturing process simulation, and more.

The scoring uses an evidence chain: each simulation type has a set of indicators it looks for in the parsed model data. The presence of an EXPLICIT step, for example, is a strong indicator of a drop test or impact simulation. The presence of *FREQUENCY with a perturbation flag is a strong indicator of modal analysis. Temperature-dependent material properties are evidence of thermal or thermomechanical analysis.

 

Multiple indicators for a given category accumulate score. The final output is a ranked list of classification candidates with confidence levels, the top classification shown prominently, and the full evidence chain for each candidate visible on request.
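The evidence-chain scoring can be sketched with a small indicator table. The categories, flag names, and weights below are illustrative assumptions, a tiny subset standing in for the real seventeen-category table:

```python
# Illustrative indicator table: a tiny subset of the seventeen categories,
# with assumed flag names and weights.
INDICATORS = {
    "drop_test_or_impact": [("has_explicit_step", 3.0),
                            ("has_initial_velocity", 2.0)],
    "modal_analysis":      [("has_frequency_step", 3.0),
                            ("has_perturbation", 1.0)],
    "thermal":             [("has_temperature_dependent_props", 2.0)],
}

def classify_intent(model_flags):
    """Accumulate indicator weights per category and rank the candidates.

    model_flags: {flag_name: bool} extracted from the parsed model.
    """
    scores = {}
    for category, indicators in INDICATORS.items():
        scores[category] = sum(weight for flag, weight in indicators
                               if model_flags.get(flag))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Each indicator that fires adds its weight, and the ranked list of (category, score) pairs is exactly what feeds the confidence display and the evidence chain shown to the user.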

The accept-reject-redirect workflow then gives you three choices. Accept the classification and proceed with purpose-specific best-practice recommendations calibrated to that simulation type. Reject it and provide the correct purpose, at which point the program performs a gap analysis — what features are present for the stated purpose, what is missing, and what conversion steps would bring the model into alignment. Or redirect, moving to a different classification.

For QA tracking and documentation, the evaluation results can be exported as a report.

 

This feature auto-launches after successful processing. It is not a gatekeeping step — you can dismiss it and proceed — but it is always there as a first-pass sanity check that can catch a model running the wrong procedure type before analysis is even attempted.


 

Section 15 — The Complete Processing Sequence, End to End

Here is the complete sequence from beginning to end, so the full pipeline is clear.

 

1.  You browse to your .inp file and the Process button activates.

2.  You click Process. A progress dialog opens. A background thread starts.

3.  The background thread calls read_inp_with_includes. The file is read recursively, following any *INCLUDE directives. Every line is paired with its source file. The output is a flat list.

4.  parse_inp_summary runs on the flat list. Node counts, element counts, material names, section counts, step definitions, contact data, amplitude definitions, boundary conditions — all collected in a single pass using the keyword state machine.

5.  parse_material_section_part_and_props runs on the flat list. Material property tables are extracted. Section records are built with thickness data. Part ownership of element sets is tracked through the part stack. The material-section-part mapping is assembled.

6.  parse_node_coordinates runs on the flat list. Node coordinates are stored with part-scoped keys for structured models and integer keys for orphan meshes.

7.  parse_element_connectivity runs on the flat list. Multi-line element definitions are handled. Each element record captures ID, type, node list, part tag, and element set tag.

8.  identify_parts runs. The dispatcher checks for part tags on elements, routes to structured or orphan-mesh identification, and builds the parts dictionary.

9.  validate_part_names compares declared ELSET names to detected part names and presents the validation dialog if discrepancies exist.

10. detect_model_type classifies the model as structured, orphan mesh, or simple. The appropriate confirmation dialog is shown.

11. detect_and_confirm_units extracts material property values, runs them against the material library for all ten unit systems, assesses confidence, and presents the unit confirmation dialog.

12. calculate_all_parts computes volume via Gaussian quadrature for every part. Densities are converted if needed. Non-geometric elements are filtered. Mass is calculated.

13. find_interface_nodes identifies shared nodes across parts. parse_interactions extracts explicit contact and constraint definitions. build_part_relationships assembles the relationship graph.

14. The interface updates: the Summary tab fills with model statistics, the Parts tab populates with the identified assembly, the Materials tab shows property data, the Sections tab shows section details.

15. The simulation intent classifier launches automatically and scores the model against seventeen simulation purposes.

 

That is the complete processing sequence. From a text file on disk to a fully analyzed, visualized, labeled assembly in memory — with every decision visible, every assumption surfaced, and every result available for independent verification.


 

Closing — The Point of All of This

I want to close by returning to why we built this the way we did.

 

Every step in that pipeline that surfaces information to the user — the orphan mesh confirmation, the unit detection dialog, the part validation cross-check, the intent classification — is there because the alternative is to make an assumption silently. And silent assumptions in engineering analysis are how you get wrong answers that look right.

 

This tool was built as an open-source reference implementation of a complete CAE preprocessing pipeline. Every function described in this document is readable, modifiable, and extensible. The code is not hidden. The logic is not locked. If you want to add a new unit system, a new material to the library, a new element type, a new simulation classification — you can. You now know where to put it.

 

The multi-AI development approach — using Claude as the primary implementation partner and routing review work to Grok, Gemini, ChatGPT, and Perplexity as independent technical advisors — produced specific, traceable improvements that are documented in the version history. The vectorized volume calculation. The K-D tree penetration check. The Poisson's ratio unit detection enhancement. The confidence gap flagging. Each one came from a deliberate review cycle, not from hoping the first implementation was good enough.

 

That is a model for how to develop complex technical software: build it, expose it to critical review from multiple independent perspectives, implement the improvements, and document what changed and why.

 

In the next part of this series, we will go into the three-dimensional geometry engine — how the program tessellates element surfaces into triangles, how it builds the STL representation, how the 3D viewer renders your assembly, and what the surface normals and winding order mean for visualization quality.

 

You can find the tool, the documentation, and the companion readers for this series at McFaddenCAE.com.

 

 

 

End of Part 1 — Model Processing

Next: Part 2 — The Geometry Engine: Tessellation, STL Export, and the 3D Viewer

 

© 2026 Joseph P. McFadden Sr. All rights reserved.  |  McFaddenCAE.com
