First and foremost, let it be known that I am not a formally educated software development engineer or data scientist; however, I have completed an online course on data science by Johns Hopkins University. Does that mean I know what I’m talking about? Probably not.

With that being said, I’ll leave it up to you, the reader, to decide if this concept makes sense. If it doesn’t, I welcome you to challenge the idea and educate me on why it won’t work.

I readily admit that there are many things in life that I don’t have the faintest idea of what I’m talking about.
Steve Jobs

Years ago, when I was only a fledgling BIM Specialist, I had a dream… That all data regarding a facility’s assets would be stored in its Building Information Model (BIM). This would include asset data, warranty information, maintenance history, trended energy usage, and more. Today, many are referring to this concept as a digital twin.

Fast forward to today where I’m now BIM Manager on the owner/client side of the AECO space. We’ve implemented a “top-down” approach to BIM standards, meaning we provide our architectural and engineering consultants with a library of Revit content for use at the project level. This ensures that the project models generate the data we require for our internal stakeholders.

Many of these stakeholders have requested that we add data to our Revit families to help them do their jobs more efficiently. This would typically make sense; however, with the maturity level of our program (we are in our terrible twos), things are changing so frequently that our BIM data is becoming extremely expensive to manage. This extraneous data introduces an uncontrolled input and forces project teams to keep up with all of the changes that are funneled through our Revit families.

After a year of dealing with dozens of program revisions and hundreds of change management issues, I’ve learned that the less data we store in our Revit families, the better.

The Minimum Viable Data Approach

BIM Managers operating at the corporate program level should consider an approach that I call Minimum Viable Data (MVD). This concept suggests that you provide the least amount of data possible in your Revit families to mitigate the time and money associated with maintaining that information. The more information defined in your models, the more data you have to maintain. So to that end, you can reduce cost by simply providing less data in client-provided Revit families.

I know this is probably the opposite of how you’ve been trained to think about BIM (especially if you are on the architecture, engineering, or construction side of the business). Furthermore, BIM Managers who represent clients/owners often have transitioned from design firms, which means they carry the mindset of “the more BIM data, the better.”

Read on for the reasons why I feel that more data can be an extremely costly BIM standard, especially for new programs which change frequently. The lessons learned in my first year as BIM Manager on the client/owner side have led me to the MVD approach, and luckily I am taking steps to lean out my BIM data before we are at scale.

1. Projects get hit with additional fees with every program revision, even if it’s only as simple as updating a Shared Parameter value.

The more data you provide to your project teams, the more it costs to keep that information up to date.

For context, I should note that my organization is a giant tech conglomerate with a $1.5T market cap as of this writing. I mention this not to brag (maybe a little), but rather to demonstrate the scale that I am currently operating at. Additionally, I am the BIM Program Manager of a new program which means lots and lots of pivots and changes.

In our program, Revit is the genesis of most data related to the assets within our facilities. We include the obvious data points like manufacturer, model numbers, vendors, descriptions, and engineering data. This information makes sense to store as BIM data because it is visible in our construction documents.

This is where I try to draw a line in the sand. Ideally, I prefer to only include data in BIM models that helps generate drawings; otherwise, the design phase carries the cost of maintaining that data. If this data isn’t related to design or construction, why should these teams manage it?

A Simple Example of The Cost Impact of a Parameter Change

In Revit families, a Shared Parameter can be defined as a URL (i.e., a website address) which is typically used to provide a link to a manufacturer’s website. This link might lead to a web page which hosts the cut sheet and other information about the asset represented in the Revit model.

If I were to add this data point to my Revit families and publish it to our projects, what would happen if the manufacturer decided to change their website address?

For the sake of this example, let’s say a change in this parameter impacted 20 Revit families, triggering a revision to each one of them. Now, this is a relatively simple task (especially with all of the bulk editing tools available these days), but my point is not to say that modifying parameter data is the problem. Not even close. The problem is pushing that change out to all of our projects in flight.

If it takes an architect one hour to update these 20 families, they’ll bill us for that change. Let’s say they charge us $150 per hour. That is definitely not a huge cost impact to the project, seeing as it’s less than I spend on my Amazon Prime membership per year. However, now scale that process up to 1,000 projects and we just spent $150,000 to update a Shared Parameter… And to top that off, it’s a Shared Parameter that brings absolutely no value to the design process.

We could even take it a step further and forecast what it would cost to make 10 different changes across 1,000 projects. You can check my math, but I believe we are now at a hefty $1.5 million. Try to explain that cost to the richest man in the world who instills frugality into his business model wherever possible.
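If you would rather check my math in code, here is a minimal sketch of the forecast. The function name and its arguments are my own illustration, and the figures are the hypothetical ones from this example (one hour per project, $150 per hour, 1,000 projects), not real billing data.

```python
def revision_cost(hours_per_project, hourly_rate, projects, changes=1):
    """Total fee for pushing `changes` revisions out to `projects` projects.

    Assumes every project is billed the same hours at the same rate,
    which is a simplification made for this example.
    """
    return hours_per_project * hourly_rate * projects * changes


one_change = revision_cost(hours_per_project=1, hourly_rate=150, projects=1_000)
ten_changes = revision_cost(hours_per_project=1, hourly_rate=150, projects=1_000, changes=10)

print(f"${one_change:,.0f}")   # $150,000
print(f"${ten_changes:,.0f}")  # $1,500,000
```

Swap in your own rates and project counts to see where your program lands.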

This is just one example of how implementing the MVD method can eliminate the cost of managing extraneous data, particularly if it brings no value to the design phase of a project. I realize that most families don’t include the URL Parameter, but think about where this example might apply to your own BIM program.

2. Revit is not a database.

Many AEC folks will argue that Revit is a database. Well, the tech sector might disagree with you.

I often hear architects and engineers refer to Revit models as databases. Well, that’s technically not true. I agree that BIM models contain structured data, but does that make them databases? Perhaps the Techopedia definition will shed some light on the matter:

“A database (DB), in the most general sense, is an organized collection of data. More specifically, a database is an electronic system that allows data to be easily accessed, manipulated and updated.”
Techopedia

If you didn’t catch the part of the definition which proves Revit is not a database, here it is again:

…allows data to be easily accessed, manipulated and updated.

Now ask yourself, “is this true for Revit?” Is it easy to access and update the data within a Revit model? That’s a hard “no” from my perspective because it feels more like the data in a Revit model is locked in a dungeon with Autodesk charging you for the keys every year.

Once data is stored in a Revit model, it can only be accessed and managed by a subset of people who understand how to use the proprietary software to update Parameters. Sure, one could export the data or build a custom application to make the data easier to manage, but does that really solve the problem of this type of encapsulation?

Now try to think of a time when someone in your organization asked you to add a Shared Parameter in your Revit models to help them with their bill of materials (BOM), standalone software, calculations, or other process. This is where you should pause and decide if it is worth it to pick up the cost for managing that data throughout the lifecycle of that asset. Again, any updates to this data will hit the project budget of the design phase as explained earlier in this post.

Perhaps the worst part of storing this extraneous data in Revit models is that only a small subset of your team can access and modify it. Once that data is defined in Revit, only team members who are trained in Revit and have licenses for the software will have the ability to view and manage it.

My point is, the inability to access the data encapsulated in a Revit model creates a bottleneck in processes. You don’t want to know how many times I’ve been asked something along the lines of, “can you open the Revit model and tell me what the Parameter value is for x and y?”

The question should never be, “can we do this in Revit,” but rather, “should we do this in Revit?”

Relational Databases: Don’t put all of your data in one bucket.

It’s not that we need less data overall, it’s about where specific data should be stored.

The MVD approach doesn’t propose the use of less data, but rather that extraneous data should be stored outside of Revit to allow the proper owners to access and manage it as they see fit. By giving these non-Revit stakeholders the ability to manage their data outside of Revit, there will be absolutely zero impact at the project level when changes occur (unless they are related to the design and construction phase). Now think back to my earlier example of the $150k cost to update a Shared Parameter. A change like this would have absolutely no impact at the project level if the data were managed in an external database.

Forming a Relationship Between Revit Elements and an External Database is Easier Than You Think

One of the most popular data models in computer science is the relational database (RDB). In layman’s terms, an RDB is a database which stores data in multiple tables and allows the records in those tables to be associated with each other.

“A relational database organizes data into tables which can be linked—or related—based on data common to each. This capability enables you to retrieve an entirely new table from data in one or more tables with a single query.”
IBM

With the MVD approach, we borrow this concept from the techies and implement a simple solution for BIM.

How does a relational database work?

In the example below, we see what the data structure might look like for a customer who has made a purchase. Note that there are three tables: 1) customer, 2) invoice, and 3) product.

Records in these tables are easily associated with records in other tables by simply referencing their primary keys. In the figure above, we see each object has an ID which is passed to the others thus creating a relationship with nothing more than a text value. The primary key of a record is all that is required to associate the data in another table.
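For readers who prefer code to diagrams, here is a rough sketch of the same customer, invoice, and product idea using Python’s built-in sqlite3 module. The column names and sample values are assumptions I made up for illustration; the only thing that matters is that the invoice points at the other two tables by their primary keys.

```python
import sqlite3

# In-memory database so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: each table has a text primary key, and the invoice
# stores nothing about the customer or product except their keys.
conn.executescript("""
CREATE TABLE customer (customer_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE product  (product_id  TEXT PRIMARY KEY, title TEXT, price REAL);
CREATE TABLE invoice  (
    invoice_id  TEXT PRIMARY KEY,
    customer_id TEXT REFERENCES customer(customer_id),
    product_id  TEXT REFERENCES product(product_id)
);
INSERT INTO customer VALUES ('C-001', 'Ada');
INSERT INTO product  VALUES ('P-100', 'Desk lamp', 35.0);
INSERT INTO invoice  VALUES ('INV-9', 'C-001', 'P-100');
""")

# A single query follows the keys and assembles a new row from all three tables.
row = conn.execute("""
    SELECT customer.name, product.title, product.price
    FROM invoice
    JOIN customer ON customer.customer_id = invoice.customer_id
    JOIN product  ON product.product_id  = invoice.product_id
    WHERE invoice.invoice_id = 'INV-9'
""").fetchone()

print(row)  # ('Ada', 'Desk lamp', 35.0)
```

If the product’s price changes, only the product table is touched; the invoice still finds it through the key.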

“Jay, what is your point?”

My point is, I don’t see why BIM data shouldn’t be relational as well. Implementing this concept can be as simple as assigning a unique identifier to each type of asset across your program. As long as that ID is unique across your program, you can use it as a key in an infinite number of external databases (or spreadsheets for that matter).

And that is the point I’m driving at here. MVD doesn’t suggest that we don’t need vast amounts of data; it suggests keeping data out of a Revit project and in an external database whenever possible.

Relational BIM Data

Now let’s look at how this concept can be applied to BIM data. In this example, the asset represents a retail display fixture which sells merchandise.

The data structure above is an extreme example of MVD implementation, but it demonstrates absolutely zero duplicate data entry. By eliminating duplicate data, you will save time and reduce risk of data becoming out of sync.

Note that the left side of the graph represents a single Revit Family Type. This asset is extremely lean and does not contain any information except a description which is what a designer would need when laying out a space.

Looking to the right side of the graph, there are a few tables which are external to the Revit model. These are examples of data points which other stakeholders within the organization require for their scope of work.

Similar to the customer purchase sample, we can create relationships with external data with nothing more than a text Parameter to identify the asset.
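Here is a minimal sketch of that relationship in Python, assuming a lean family type that carries only an Asset_Id and a description. The external tables, their field names, and every value shown are hypothetical; in practice they would be whatever your procurement or merchandising teams already maintain.

```python
# The lean Revit family type: just the key and a description (hypothetical values).
revit_type = {"Asset_Id": "FX-0001", "Description": "Retail display fixture"}

# External tables owned by other stakeholders, each keyed by the same Asset_Id.
procurement = {"FX-0001": {"vendor": "Example Vendor", "lead_time_weeks": 6}}
merchandising = {"FX-0001": {"capacity_units": 48}}


def asset_view(family_type, **tables):
    """Combine the Revit data with each external table's record for this asset."""
    key = family_type["Asset_Id"]
    view = dict(family_type)  # copy so the Revit data itself is never modified
    for name, table in tables.items():
        view[name] = table.get(key, {})  # empty dict if a team has no record yet
    return view


view = asset_view(revit_type, procurement=procurement, merchandising=merchandising)
print(view["procurement"]["vendor"])         # Example Vendor
print(view["merchandising"]["capacity_units"])  # 48
```

Notice that the vendor can change in the procurement table without anyone opening a Revit family.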

If you need to import some of the external data into the Revit model, you could simply push that data into the Revit model programmatically using the OpenDefinery API.

Thoughts?

If you can give me a solid reason why we should store all of our asset data in a single Revit model, then by all means… Change my mind. Otherwise, read on for some tips on implementation.

The Asset ID Standard

Implementing a low-tech relational database is easier than you think. In fact, you may not even need an actual database.

To implement the concept of a primary key in Revit, you only need to 1) add a type parameter to your families for each Asset ID, and 2) build your external database with each record (i.e., a row in Excel or CSV) referencing the Asset ID as the key.

It’s as simple as that. There is no additional software to implement and no automation tools to build. Although the data is in two seemingly decoupled data sets, you can easily key them to each other using nothing more than that single Shared Parameter.
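To make this concrete, here is a small sketch that treats a CSV as the external database. The column names, the example.com URLs, and the list of scheduled IDs are placeholders I invented; the CSV text is inlined only so the example runs on its own, and in practice it would be a file your stakeholders maintain.

```python
import csv
import io

# Stand-in for an external CSV file maintained outside of Revit (hypothetical data).
EXTERNAL_CSV = """Asset_Id,Manufacturer_URL,Warranty_Years
FX-0001,https://example.com/fx-0001,2
FX-0002,https://example.com/fx-0002,5
"""


def load_by_asset_id(csv_text):
    """Index each CSV row by its Asset_Id so it can be looked up like a primary key."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Asset_Id"]: row for row in reader}


external = load_by_asset_id(EXTERNAL_CSV)

# Asset IDs as they might appear in a schedule exported from Revit (hypothetical).
scheduled_ids = ["FX-0001", "FX-0002", "FX-0003"]

# Look up external data by key, and flag any asset with no external record.
print(external["FX-0001"]["Manufacturer_URL"])  # https://example.com/fx-0001
missing = [asset_id for asset_id in scheduled_ids if asset_id not in external]
print(missing)  # ['FX-0003']
```

If the manufacturer changes its website, someone edits one cell in the CSV and no Revit family is revised.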

Using an Open Standard

Whenever possible, I highly recommend adopting an existing Shared Parameter rather than creating a new one. OpenDefinery Collections will help you navigate through the multiple Shared Parameter standards that already exist throughout the industry, and I recommend finding a Shared Parameter to clone and implement in your program.

Which Shared Parameter should you use?

Note that the AssetIdentifier Shared Parameter in The NBS BIM Object Collection is intended to be used at the facility level for an instance of an asset. This is not the ID that we want for our primary key at the program level because it will be unique per project.

I have started our very own OpenDefinery Collection and included a Shared Parameter named Asset_Id. This is the Parameter that I’ve implemented in my program, and if you’re starting with a clean slate, this might be “a GUID place to start”.

If you have a Shared Parameter that you’ve already implemented, please share it on OpenDefinery and let us know in the comments below.

Implementation Notes

The primary key should be the same Shared Parameter that is shown on your schedules and floor plans to identify the asset. Moving forward, it will be the key that is referenced by all stakeholders throughout the project lifecycle (e.g., procurement, marketing, and merchandising). It only helps if it is the key that all stakeholders actually reference.
You may come across teams within your organization who assign their own unique identifier for assets. Don’t add this to your families or you’ll fall into the traps described earlier in this article. Stakeholders should map their identifiers to each other’s identifiers as needed, creating a relationship which keys into the primary key of the assets in your Revit models.
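As a sketch of the second note, a team that insists on its own identifier can keep a simple crosswalk on its side instead of asking for another Parameter in the families. The SKU format and values below are invented for illustration.

```python
# Hypothetical crosswalk owned by a procurement team: their SKU -> program Asset_Id.
sku_to_asset_id = {
    "SKU-88231": "FX-0001",
    "SKU-88232": "FX-0002",
}

# The reverse lookup is derived, never maintained by hand. This assumes each
# asset maps to exactly one SKU; a one-to-many mapping would need a list here.
asset_id_to_sku = {asset_id: sku for sku, asset_id in sku_to_asset_id.items()}


def resolve_asset_id(sku):
    """Return the program-level Asset_Id for a team's SKU, or None if unmapped."""
    return sku_to_asset_id.get(sku)


print(resolve_asset_id("SKU-88231"))  # FX-0001
print(asset_id_to_sku["FX-0002"])     # SKU-88232
print(resolve_asset_id("SKU-00000"))  # None
```

The Revit families never learn that the SKU exists, so a renumbering on the procurement side costs the projects nothing.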

What do you think?

Will the Minimum Viable Data concept work for your program or do you prefer to prescribe a more robust set of Shared Parameters in your Revit families? I’d like to hear more about how other BIM Managers are running their program from the top-down, so please leave me a comment if you have any thoughts.

Jay Merlan

Jay is a Revit Certified Professional and has over 15 years of experience as a BIM manager at multiple design and construction firms. In his previous role as the Sr. BIM Manager at one of the largest tech conglomerates in the world, his primary goal was to streamline the design and construction processes by implementing a top-down approach to BIM standards, which was critical to owner/operator workflows. Today, Jay is spinning up his own startup, Triple Zero Labs, and is focused on creating tools for design automation and data management with BIM at its core.