Good to know - Using filters on data streams
We recently had a case where a customer had several million records we wanted to import into Data Cloud. We weren't overly concerned about the import cost, since importing/ingesting from Salesforce core is now free, but we were concerned about storage costs. Though storage costs are generally minimal, we follow best practices and try to avoid storing data that isn't necessary. As this was an AI use case, we only needed closed cases, so we added a filter for those plus some additional criteria. What was perplexing is that the full Case data set still showed up in the record count in Data Cloud.
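Before ingesting, it can help to estimate how many records actually match your criteria by running a count query against Salesforce core. Here is a minimal sketch using the simple-salesforce Python library; the credentials and the Status filter are placeholders you would swap for your own:

```python
from simple_salesforce import Salesforce

# Placeholder credentials -- substitute your own org's values.
sf = Salesforce(
    username="user@example.com",
    password="password",
    security_token="token",
)

# COUNT() avoids pulling rows back; IsClosed is a standard Case field.
result = sf.query("SELECT COUNT() FROM Case WHERE IsClosed = true")
print(f"Closed cases we expected to land in Data Cloud: {result['totalSize']}")
```

Comparing this number against the record count Data Cloud reports after ingestion is exactly how the mismatch described above becomes visible.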
After a quick search, we learned that data filters are only applied after the data has been ingested into Data Cloud; they limit what the data space can see, not what is imported. This was documented in a Salesforce knowledge article (002721752) this past August. Data Cloud storage is currently priced at $23 per terabyte per month, so again, the cost will probably be fairly low for this use case, but it is difficult to determine the storage consumed by a particular data lake object, so I cannot give accurate figures for the impact.
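For a rough sense of scale, here is a back-of-the-envelope estimate. The record count and the $23/TB/month rate come from the scenario above; the average record size is an assumption, not a measured value:

```python
# Back-of-the-envelope Data Cloud storage cost estimate.
records = 5_000_000          # "several million" records
avg_record_bytes = 2_048     # assumption: ~2 KB per case record
price_per_tb_month = 23.0    # current Data Cloud storage pricing

total_tb = records * avg_record_bytes / 1e12  # decimal terabytes
monthly_cost = total_tb * price_per_tb_month
print(f"~{total_tb:.4f} TB -> ~${monthly_cost:.2f}/month")
```

Even at ten times that assumed record size, the monthly figure stays under a few dollars, which is why this was a best-practice concern rather than a budget one.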
Another challenge is that once you've created the stream and the filter, you might think you can't edit the filter afterward, as there is nothing on the data stream edit page that allows this. This is one of those gotcha moments where you need to understand Data Cloud and have been working with it for some time to know where to go to fix it. In this case, since it is a data space related filter, you need to go to the Data Spaces page and edit it from there. Luckily, we were aware of this and did not go through the entire arduous process of tearing down and rebuilding the data stream.
Learnings:
- Data filters at the data space level do not limit the number of records ingested into Data Cloud (see the verification sketch below)
- Data filters can be changed after they are created (on the Data Spaces page)
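One way to confirm the first learning is to count the rows actually sitting in the data lake object. Below is a hedged sketch against the Data Cloud Query API; the instance URL, token handling, and DLO name are all assumptions you would replace with your own values:

```python
import requests

# Assumptions: the tenant URL, token, and DLO name below are placeholders.
# The /api/v2/query endpoint is the Data Cloud Query API, which accepts
# ANSI SQL in a JSON body.
DATA_CLOUD_INSTANCE = "https://your-tenant.c360a.salesforce.com"
ACCESS_TOKEN = "..."  # obtain via the Data Cloud token exchange flow

resp = requests.post(
    f"{DATA_CLOUD_INSTANCE}/api/v2/query",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json={"sql": "SELECT COUNT(*) FROM Case_Home__dll"},  # hypothetical DLO name
)
resp.raise_for_status()
print(resp.json())  # the row count reflects everything ingested,
                    # regardless of any data space filter
```

If this count matches the unfiltered source rather than your filter criteria, you are seeing the same behavior described in this article.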
References:
- Salesforce Knowledge Article 002721752