In a previous post, I discussed the way that adjectives have been replacing adverbs, and wondered what had happened to "ly". For example:
Drive Safe
rather than:
Drive Safely
I had quite a bit of feedback on this, both on and offline. Language discussions are always busy. But another similar trend came up in a discussion that I recently took part in.
A friend asked whether, if you use the term:
Data Ingestion
the opposite would be:
Data Exgestion
Now I know that words that swap an "in" prefix for its opposite usually use "e" and not "ex". For example, ingress and egress (rather than exgress). This means that it would be more likely to be:
Data Egestion
Given that Ingestion normally refers more to food than anything else, it's hardly surprising that Egestion typically refers to what comes out of a backside, so it's probably not a great option, at least in common usage 🙂
Others suggested that the terms:
Data Import
Data Export
should just be used instead. While I agree, it's interesting to note that the above terms are often used as nouns. And that got me wondering about when we started using verbs as nouns.
I never hear anyone talk about a Data Ingest, as though it's a "thing", only about Data Ingestion as an act. But we talk about performing a Data Import, treating it as both a thing and the act of performing it, yet I rarely hear anyone say Data Importation when they are discussing the action.
Language is curious.
So what is the best opposite for Data Ingestion, or is the term best avoided in the first place?
Think we should choose a viewpoint: if we stay biological, it's Ingestion / Excretion.
Otherwise, think we should try to be consistent with other data processing language – if we talk about ETL, then the terms here would be [Data] Load / [Data] Extract.
On reflection, we're all a little misleading with our language: when a tooth is "Extracted", it's no longer in its original jaw… If I Export a container of shoes, it does not stay in its original dock. Physical objects (Newtonian model) only exist in one location. In my experience we tend to *copy* data far more than we *move* it, especially when we're talking about bulk records (a SELECT statement results in copies of records in a result set which I can then store or pass on). So for our bulk data, perhaps 'Replicate' is a better term?
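As a rough sketch of that copy-versus-move point: the snippet below uses Python's built-in sqlite3 module purely as a convenient stand-in, and the table names `source` and `target` are made up for illustration. It suggests that a SELECT leaves the original rows where they were, and that a real "move" needs an explicit delete on top.

```python
import sqlite3

# In-memory database; "source" and "target" are hypothetical table names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO source (name) VALUES (?)", [("a",), ("b",), ("c",)])

# "Extracting" with SELECT copies the rows into a result set.
extracted = conn.execute("SELECT id, name FROM source ORDER BY id").fetchall()
remaining_after_select = conn.execute("SELECT COUNT(*) FROM source").fetchone()[0]

# Loading the copies elsewhere still leaves the originals in place.
conn.executemany("INSERT INTO target (id, name) VALUES (?, ?)", extracted)

# Only an explicit DELETE turns the copy into a move.
conn.execute("DELETE FROM source")
remaining_after_move = conn.execute("SELECT COUNT(*) FROM source").fetchone()[0]
target_count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]

print(remaining_after_select, remaining_after_move, target_count)
```

Unlike the tooth or the container of shoes, the extracted rows are still in the source table until something deliberately removes them.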
Even if you stay with biological, I think egestion would be a better choice than excretion. Egestion is outputting things that were ingested and not digested. Excretion relates to the output of internal metabolic processes.
Language is odd.
This is the relevant conversation I needed today. My votes are for "data extraction", "data view(s)", and "data consumption". While I agree with the finer points of paying homage to the appropriate academic vocabulary at hand, from the perspective of language use I find my team and colleagues (across multiple geographies worldwide) using the simpler terms I share here.
Yes, language is tricky, but I always like simple. That's why I'd talked about Data Import and Data Export but the ones you mention are also simple.
I don't care for using biological terms to describe data processing concepts. But if we're going to do this, then let's at least be consistent. So if we use "Data Ingestion" instead of "Data Import," then instead of saying "an import error occurred" when the process fails, let's say "data indigestion occurred."
Love it
How about regurgitation as the opposite of ingestion?
Hi Don, yes, possibly, but regurgitation is usually both in then out.