A group of ex-NSA and Amazon engineers are building a ‘GitHub for data’


Six months ago or thereabouts, a group of engineers and developers with backgrounds from the National Security Agency, Google and Amazon Web Services had an idea.
Data is valuable for helping developers and engineers build new features and innovate. But that data is often highly sensitive and out of reach, kept under lock and key by red tape and compliance processes, and getting approval to use it can take weeks. So, the engineers started Gretel, an early-stage startup that aims to help developers safely share and collaborate with sensitive data in real time.
It’s not as niche of a problem as you might think, said Alex Watson, one of the co-founders. Developers can face this problem at any company, he said. Often, developers don’t need full access to a bank of user data; they just need a portion or a sample to work with. In many cases, developers could make do with data that looks like real user data.
“It starts with making data safe to share,” Watson said. “There’s all these really cool use cases that people have been able to do with data.” He said companies like GitHub, a widely used source code sharing platform, helped to make source code accessible and collaboration easy. “But there’s no GitHub equivalent for data,” he said.
And that’s how Watson and his co-founders, John Myers, Ali Golshan and Laszlo Bock, came up with Gretel.
“We’re building right now software that enables developers to automatically check out an anonymized version of the data set,” said Watson. This so-called “synthetic data” is essentially artificial data that looks and works just like regular sensitive user data. Gretel uses machine learning to categorize the data, such as names, addresses and other customer identifiers, and attach as many labels to it as possible. Once the data is labeled, access policies can be applied to it. Then, the platform applies differential privacy, a technique used to anonymize vast amounts of data, so that the data is no longer tied to customer information. “It’s an entirely fake data set that was generated by machine learning,” said Watson.
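For readers who want a concrete picture of the label, policy and differential privacy steps, here is a minimal Python sketch. It is not Gretel's code and makes several assumptions: regular expressions stand in for the machine-learning classifier, the `DETECTORS` and `REDACT_LABELS` tables and all function names are hypothetical, and the privacy step is the textbook Laplace mechanism applied to a single count rather than full synthetic data generation.

```python
from __future__ import annotations

import random
import re

# Hypothetical detectors: regexes stand in for the ML classifier in the article.
DETECTORS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,}$"),
}

# Hypothetical access policy: labels that must be redacted before sharing.
REDACT_LABELS = {"email", "phone"}


def label_value(value: str) -> str | None:
    """Return the first label whose pattern matches the value, else None."""
    for label, pattern in DETECTORS.items():
        if pattern.match(value):
            return label
    return None


def apply_policy(record: dict[str, str]) -> dict[str, str]:
    """Redact every field whose detected label is covered by the policy."""
    return {
        key: "<redacted>" if label_value(val) in REDACT_LABELS else val
        for key, val in record.items()
    }


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a count under epsilon-differential privacy (Laplace mechanism).

    Adding or removing one record changes a count by at most 1, so the
    noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rows = [
        {"contact": "ana@example.com", "plan": "pro"},
        {"contact": "+1 555 010 0199", "plan": "free"},
        {"contact": "li@example.org", "plan": "pro"},
    ]
    print([apply_policy(r) for r in rows])
    print(private_count(rows, lambda r: r["plan"] == "pro", 1.0, random.Random(7)))
```

A smaller `epsilon` adds more noise and gives stronger privacy. The noisy count is useful in aggregate but no longer reveals whether any single customer is in the data.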
It’s a pitch that’s already gathering attention. The startup has raised $3.5 million in seed funding to get the platform off the ground, led by Greylock Partners, and with participation from Moonshots Capital, Village Global and several angel investors.
“At Google, we had to build our own tools to enable our developers to safely access data, because the tools that we needed didn’t exist,” said Sridhar Ramaswamy, a former Google executive, and now a partner at Greylock.
Gretel said it will charge customers based on consumption — a similar structure to how Amazon prices access to its cloud computing services.
“Right now, it’s very heads-down and building,” said Watson. The startup plans to ramp up its engagement with the developer community in the coming weeks, with an eye on making Gretel available in the next six months, he said.