The Alation Data Intelligence Platform
Data lineage has reduced incidents and improved impact analysis but search still needs significant work
What is our primary use case?
I am the administrator and server admin for Alation Data Catalog. We create custom processes to upload metadata into Alation Data Catalog. We are trying to get as much metadata into Alation Data Catalog as possible and integrate it with other tools that we are also building. The main purpose of Alation Data Catalog at Upstart is to help people find the data they need and understand how to use it.
Recently, many people are trying to deprecate tables in the source databases and find what downstream impacts that would have. They use Alation Data Catalog lineage to determine if there are any obvious uses of these tables. This is the biggest use case recently. People also use it to find descriptions and service owners.
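The deprecation check described above amounts to walking the lineage graph downstream from the table in question. The following is a minimal sketch of that idea, not Alation's implementation: the `LINEAGE` mapping, the table names, and the `downstream_of` helper are all hypothetical, and in practice the edges would come from the catalog's lineage data.

```python
from collections import deque

# Hypothetical lineage edges: each object maps to the objects that read from it.
LINEAGE = {
    "raw.loans": ["staging.loans_clean"],
    "staging.loans_clean": ["marts.loan_metrics", "ml.features_v2"],
    "marts.loan_metrics": ["dashboard.weekly_kpis"],
}


def downstream_of(table, lineage):
    """Return every object reachable downstream of `table`, breadth-first."""
    seen = set()
    order = []
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:  # guards against cycles and duplicates
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Objects that would be affected if staging.loans_clean were deprecated.
impacted = downstream_of("staging.loans_clean", LINEAGE)
print(impacted)
```

An empty result suggests no recorded consumers, though it only reflects the lineage the catalog has actually captured.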
What is most valuable?
I use Alation Data Catalog's analytics piece, Alation Analytics, to create dashboards in Databricks based on annotations or source comments on the databases themselves. This allows us to hold teams who own these tables accountable for missing annotations or possibly incorrect annotations on the source databases.
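The accountability check behind such a dashboard can be as simple as grouping tables with empty comments by owning team. This is a rough sketch under assumptions: the `TABLES` rows, their field names, and the `missing_annotations_by_team` helper are invented for illustration and do not reflect the actual Alation Analytics schema.

```python
from collections import defaultdict

# Hypothetical rows, shaped like an export from a catalog analytics table.
TABLES = [
    {"table": "marts.loan_metrics", "owner_team": "analytics", "comment": "Weekly loan KPIs"},
    {"table": "marts.tmp_scratch", "owner_team": "analytics", "comment": ""},
    {"table": "ml.features_v2", "owner_team": "ml", "comment": None},
    {"table": "raw.loans", "owner_team": "platform", "comment": "Raw loan events"},
]


def missing_annotations_by_team(rows):
    """Group tables whose source comment is absent or blank by owning team."""
    missing = defaultdict(list)
    for row in rows:
        comment = row.get("comment") or ""
        if not comment.strip():
            missing[row["owner_team"]].append(row["table"])
    return dict(missing)


report = missing_annotations_by_team(TABLES)
print(report)
```

Detecting incorrect annotations, as opposed to missing ones, would need a human or a separate validation step; this sketch covers only the missing case.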
The feature we use most in Alation Data Catalog is lineage and impact analysis. Other services call the Alation Data Catalog API to get lineage, and that lineage becomes input to LLM models and to lineage services that another team builds. We use the API extensively, and it is great to have all the context in one place. Since Alation Data Catalog does not yet offer exactly what we want for our custom LLMs, we build our own solutions for chatting with data and feed the catalog metadata in as context. We can see descriptions and column types and search easily, even though the search can be a little inconsistent.
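As an illustration of using catalog metadata as LLM context, the sketch below flattens a table's description, column types, and upstream lineage into a plain-text block that could be prepended to a prompt. The `build_context` helper and the shape of its inputs are assumptions made for this example; the review does not describe the actual format used.

```python
def build_context(table, metadata, upstream):
    """Format catalog metadata for one table as a plain-text prompt context."""
    lines = [
        f"Table: {table}",
        f"Description: {metadata.get('description') or 'n/a'}",
        "Columns:",
    ]
    for name, col_type in metadata.get("columns", []):
        lines.append(f"- {name} ({col_type})")
    lines.append("Upstream: " + (", ".join(upstream) if upstream else "none"))
    return "\n".join(lines)


# Hypothetical metadata, as it might look after being fetched from the catalog.
metadata = {
    "description": "Cleaned loan records",
    "columns": [("loan_id", "bigint"), ("amount", "decimal(18,2)")],
}
context = build_context("staging.loans_clean", metadata, ["raw.loans"])
print(context)
```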
What needs improvement?
Alation Data Catalog API integration has mostly been smooth. The exceptions are the LLM and lineage API pieces, where it is not clear what the costs are, how many requests are available, or what counts as a request. One issue we found before, which I believe is solved now, came from using the same refresh token in multiple processes that could run in parallel. Each process would use the refresh token to create an API token, and that API token would be invalidated as soon as another process used the same refresh token to create its own. We are now starting to use Alation Data Catalog service accounts, with a different service account for each process, so this does not happen anymore.
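The token problem can be reproduced with a toy model. The sketch below is not the Alation API: `FakeTokenService` is a hypothetical in-memory stand-in that assumes, as the review describes, that creating a new API token from a refresh token invalidates the previous token created from that same refresh token.

```python
import itertools


class FakeTokenService:
    """Toy stand-in for a token endpoint; NOT the real Alation API."""

    def __init__(self):
        self._active = {}  # refresh token -> the one API token currently valid for it
        self._counter = itertools.count(1)

    def create_api_token(self, refresh_token):
        api_token = f"api-{next(self._counter)}"
        # Overwriting the entry invalidates any earlier token for this refresh token.
        self._active[refresh_token] = api_token
        return api_token

    def is_valid(self, api_token):
        return api_token in self._active.values()


service = FakeTokenService()

# Two processes sharing one refresh token: the second call invalidates the first token.
token_a = service.create_api_token("shared-refresh")
token_b = service.create_api_token("shared-refresh")
print(service.is_valid(token_a), service.is_valid(token_b))

# One credential per process: the tokens no longer interfere with each other.
token_c = service.create_api_token("refresh-for-process-c")
token_d = service.create_api_token("refresh-for-process-d")
print(service.is_valid(token_c), service.is_valid(token_d))
```

Giving each process its own credential, which is what separate service accounts achieve, removes the shared state that caused the invalidation.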
The current search functionality in Alation Data Catalog is not great. It does not support natural language search, which we would prefer. It mostly searches titles, and we would like it to also search descriptions and source comments to find the answers we need. Additionally, we would like to be able to search for common or frequently used queries, so people can get ideas for which queries to use when they want to find a specific metric.
We have been able to find downstream impacts more easily with Alation Data Catalog. Our machine learning team uses it quite often to find certain data. Our analytics team has yet to adopt it as much as we would prefer, because it lacks easy ways to search for and find queries they can reuse, including endorsed queries. They want to figure out how to use the tables and join them to find other metrics, which is difficult in the current state, though I believe improvements are planned. The other aspect is holding data producers accountable and being able to see who owns tables. Currently, that is a manual process, but we are creating an automated process to add owners to tables. If anybody has a question about a table that the description does not answer, they can find the owner of the data on the Alation Data Catalog table page and ask them.
We do not have as many P1 incidents anymore based on anecdotal evidence. Previously, a change may have happened and people did not know about the downstream impacts, which caused a lot of issues. Now it is easier to mitigate or just not encounter the P1 incidents in the first place.
The search feature of Alation Data Catalog could be improved. Alation Data Catalog Compose is also interesting in that we cannot search queries or see queries in the table page that are not published unless we go to the query history. We do not allow Compose on many items right now due to information security. Our security requirements do not allow Alation Data Catalog to access the underlying connections because we do not want people to pull in data. From a security standpoint, that is an issue, and we would like to have workarounds in certain cases. The other issue that we have found recently is along the same lines of security. We do not want to automatically sample tables in Alation Data Catalog, because we do not want that data being stored on a different server. However, if a schema is enabled for sampling, then any new table in it automatically gets enabled for sampling. We have had to work around this by figuring out the correct permissions and setup on the Databricks side to prevent Alation Data Catalog from sampling certain tables, because it is not feasible to do that on the Alation Data Catalog side.
For how long have I used the solution?
I have been using Alation Data Catalog for over a year at this point.
What do I think about the stability of the solution?
Alation Data Catalog is stable for the most part. There are some issues with long load times for DBT models, but for the most part, it is stable. It has never gone down when we wanted to access it.
What do I think about the scalability of the solution?
Alation Data Catalog's scalability has met our needs as our data and user base has grown. We mostly need viewer access, so the licensing with the cloud-native solution allows unlimited viewer access. There are no other limits that we have even found except some of the API limits on the free version where we would have to pay if we wanted to increase our API calls.
How are customer service and support?
We interact with customer support for Alation Data Catalog quite often for issues like missing Looker lineage in some Redshift tables. The support time can vary. If the issue is well known, it is usually quick or easy to figure out. More complex issues like missing lineage take much longer than we would prefer because the engineering team has to get involved. Some of them have dragged on for six months, which is much longer than we would ever want a support issue to go on.
What was our ROI?
We do see a return on investment, though I cannot share exact numbers. There are fewer questions being asked in our ask data platform channel or some other channel about data itself since the implementation of Alation Data Catalog.
What other advice do I have?
My advice for others looking into Alation Data Catalog is to use the cloud-native solution. On-premises is much more difficult to deal with, and the cloud service provided by Alation saves a lot of time because you do not have to manage most of it yourself. If you want to pull in data, make sure your security requirements and best practices allow it, and that no sensitive information can leak into Alation Data Catalog. Otherwise you will have to ask Alation to delete that data, which can take some time. I would rate this product a 7 overall.
Exceptional Tool and Team, Expanding Our Data Governance
While the tool itself is top-tier, I would be remiss not to acknowledge the Alation team. From sales to onboarding and ongoing support, Alation invests in the relationship and genuinely listens to client concerns. Their team is a valued partner to our governance program.
Lineage tracking has supported analyst onboarding and streamlined access to data knowledge
What is our primary use case?
I have been using Alation Data Catalog for the past year now.
My main use case for Alation Data Catalog is updating data descriptions and making sure analysts have what they need to do their job.
We mainly use Snowflake and DBT to update descriptions and documentation that then gets automatically pulled into Alation Data Catalog for analysts to use.
What is most valuable?
I really appreciate the lineage functionality that Alation Data Catalog offers.
The lineage functionality helps me in my work by letting analysts see how downstream tables and views are built, and it can also be connected to Power BI, which is impressive.
Alation Data Catalog has positively impacted my organization by allowing analysts to be onboarded much quicker and easier as a lot of their questions are answered within Alation Data Catalog. This lowers the burden of other analysts within the business having to teach and answer queries that would already be answered within Alation Data Catalog.
What needs improvement?
One downside of the lineage is that some of the code is quite difficult to read because the formatting can be ugly.
Beyond the lineage formatting, there are things like data flow objects where the naming conventions are a bit odd. I believe catalog sets could also be improved, since we currently need a lot of them just to apply stewards to different schemas, columns, and tables.
What do I think about the stability of the solution?
From my experience, Alation Data Catalog is stable.
What do I think about the scalability of the solution?
Alation Data Catalog's scalability is really good since everything is cloud and everything gets pulled into the system via Snowflake and DBT automatically.
How are customer service and support?
So far, customer support from Alation has been good. We have been able to contact them, they have helped us whenever we have had issues, and they have made some improvements based on our feedback.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We did not previously use a different solution.
How was the initial setup?
I was not personally involved in the setup of the pricing for Alation Data Catalog, but I do know that we were awarded or given a lot of free viewer licenses, which helped a lot.
What was our ROI?
I have not seen a return on investment, and I have not measured those things yet since we have not had it long enough; it still takes quite a bit of time to update and sell the product to the business.
Which other solutions did I evaluate?
I was not part of the discussions before choosing Alation Data Catalog, and I do not think there were other options; Alation was the only option.
What other advice do I have?
My advice to others looking into using Alation Data Catalog is to sort their metadata out in Snowflake and their source system first before developing any connections to Alation.
My company does not have a business relationship with Alation other than being a customer.
I gave Alation Data Catalog a review rating of 9.
Uses metadata cataloging and data lineage to support data governance and improve data discoverability
What is our primary use case?
My main use case for Alation Data Catalog is cataloging various business domains, technical metadata, and business metadata for them.
I use Alation Data Catalog for cataloging business domains by cataloging different data sources, such as Snowflake, Power BI, Informatica, and DBT, so we integrate these sources with the business domains.
We use the Alation Data Catalog cloud version, ACS, and we utilize its lineage, its governance techniques, and its cataloging of metadata objects.
What is most valuable?
The best features Alation Data Catalog offers include data discovery, data cataloging, data storage, technical and business metadata for your business domains, and lineage for data flow visibility.
The data lineage feature in Alation Data Catalog gives an idea of how the data is being transformed from different sources towards the end product, to the data product level, which helps my team day-to-day.
Data discovery in Alation Data Catalog allows me to access all the data within the different data sources with a single click. Alation Data Catalog has positively impacted my organization by improving the data marketplace section, the data utilization section, and the quality of the data.
Specific outcomes of those improvements include increased data usage and time saved.
What needs improvement?
I notice a couple of pain points with Alation Data Catalog: they can improve on the lineage perspective and improve their data sources integration.
They can improve the search capabilities for the data.
For how long have I used the solution?
I have been using Alation Data Catalog for a year.
What other advice do I have?
My advice for others looking into using Alation Data Catalog is to make use of its data discoverability, data profiling, and data quality features.
I think it is a good cataloging tool for governance activities. I would rate this product an 8 out of 10.
Empowers business users to independently access and understand data through a centralized glossary
What is our primary use case?
My main use case for Alation Data Catalog is for data governance and for non-tech business users who especially want to understand some tech terms. For example, we have a huge data warehouse where every field in each database, in that particular table, in that particular schema, needs to be explained, and we try to streamline that technicality in a user-friendly non-technical way. That is where we use Alation Data Catalog to do the data governance and to provide information to non-technical business users.
Mainly, we do data governance and help business users, including non-tech business users, and this is the main use case for us.
My advice for others looking into using Alation Data Catalog is that it is a great tool: it can help non-technical business users, has a robust business glossary, and aids in making faster, strategic decisions while improving productivity. I would highly advise others to use it as a data governance tool.
What is most valuable?
I appreciate the features in Alation Data Catalog, including data quality, faster data discovery, data stewardship, privacy choices, tracking data usage, collaboration, data governance, and open connected frameworks. Data stewardship is something I really use a lot, as it describes the activity associated with curating and governing data so that other users can find, understand, trust, or use it. A good data catalog should empower data leaders to identify data stewards based on the work already done, assign relevant assets to stewards, and clearly define responsibilities for them, fostering accountability and ensuring that the data is correctly handled throughout its lifecycle.
When it comes to data stewardship, Alation Data Catalog makes my job easier by providing a robust business glossary, which is another key tool for stewardship leaders to look for in a data catalog. It serves as a central repository that lays out clear, data-backed definitions of terms, such as profit, alongside business metrics, to ensure clear, consistent communication across departments. The stewardship capability I use most is the business glossary, which I wrote and managed.
Alation Data Catalog positively impacts my organization by improving data efficiency, data quality, and collaboration. My stakeholders can find relevant data assets quickly using a Google-style search without needing technical jargon, reducing the time spent searching for data. It has also increased productivity by streamlining data discovery and providing context, so analysts spend more time on analysis and less on searching and preparing. Business users are empowered to access and understand data independently without relying on IT teams or developers for ad-hoc requests. For example, my product owner does not usually go directly to a data engineer or developer with a question; she can search in Alation and understand the business glossary in a user-friendly manner. Additionally, when a new analyst or team member is onboarded, the centralized documentation in the catalog and business glossary helps them understand things more quickly.
What needs improvement?
I find it difficult to identify any pain points with Alation Data Catalog. As of now, I cannot think of anything that needs improvement.
For how long have I used the solution?
I have used Alation Data Catalog for 1.5 years. I work closely with the product owner for Pizza Hut Global, where we do data governance and data stewardship work, which is where I use Alation Data Catalog.
What do I think about the stability of the solution?
In my experience, Alation Data Catalog is stable, as I have not encountered any downtime or reliability issues.
What do I think about the scalability of the solution?
Alation Data Catalog handles growth and increased data volume well for my organization, as it has good scalability.
How are customer service and support?
I have not needed to contact customer support for Alation Data Catalog, so I have not faced any issues with customer support, and everything is sorted for our team.
Which solution did I use previously and why did I switch?
I did not previously use a different solution for data governance or cataloging before Alation Data Catalog.
What about the implementation team?
I think we are just consumers of the service from Alation and not in a business relationship as a partner or reseller.
What was our ROI?
I can share that when I joined, Alation Data Catalog helped me to understand things and reduce my time, cutting it by 50 percent. Instead of going into Google or consulting a colleague, I can just go to Alation Data Catalog, search there, and understand everything by myself. Thus, it reduced my time and saved my colleagues' time, so I can say at least 50 percent of my time is saved.
What's my experience with pricing, setup cost, and licensing?
I do not have any insight on the pricing, setup cost, or licensing, as those matters are handled by the finance team, so I cannot comment much on it.
Which other solutions did I evaluate?
I am not certain which alternatives to Alation Data Catalog were evaluated, since my company was already using it when we started.
What other advice do I have?
Alation Data Catalog is deployed in the cloud at our organization. We use both Azure and Google Cloud for Alation Data Catalog. Alation Data Catalog leads to faster decision making: it provides a single simplified view of data and clear definitions, so teams can find and use the right data more quickly and make better-informed strategic decisions. I would say faster decision making is a key benefit. I would rate this product a 10 overall.