Tuesday, August 18, 2020

Show HN: Mp3 to Text https://ift.tt/3iSOC64

Show HN: Mp3 to Text https://ift.tt/314IdhQ August 18, 2020 at 09:33PM

Launch HN: Synth (YC S20) – Realistic, synthetic test data for your app https://ift.tt/3iUnFyy

Launch HN: Synth (YC S20) – Realistic, synthetic test data for your app

Hey! Christos, Damien and Nodar here. We're the co-founders of Synth ( https://getsynth.com ). Synth is an API that lets you quickly and easily provision test databases with realistic data for testing your application.

We started our company about a year ago, after working at a quantitative hedge fund in London where we built models to trade US equities. Strangely, instead of developing models or building the trading system, we spent a large portion of our time just sourcing and onboarding datasets to train and feed our models. The process of testing and onboarding datasets was archaic. One data provider served us XML files over FTP, which we then had to spend weeks transforming for our models to ingest. A different provider asked us to spin up our own database and then sent us a binary that loaded the data. We had to whitelist their API IP address and set up a cron job to make sure the dataset was never out of date. The binary took interactive input, so it couldn't be scripted directly; scripting it required something to mock the interactive parameters. All this took a junior developer on the team a good 3-4 days to figure out and set up. Furthermore, after our trial expired we decided we didn't actually need this dataset, so those 3-4 days were essentially wasted.

Our frustration with the status quo in data distribution is what drove us to start our company. We spent the first 6 months building a privacy-aware query engine (think Presto, but with built-in privacy primitives), but the software developers we talked to would frequently divert the conversation to the lack of high-quality, sanitised testing data during the software development lifecycle. It was strange: most of us developers and data scientists constantly use some sort of testing data for different reasons.
Maybe you want a local development environment that is representative of production but clean of customer data. Or a staging environment with a much smaller, representative database so that tests run faster. You might want the dataset to be much bigger to test how your application scales. Maybe you want to share your database with third-party contractors whom you don't necessarily trust. Whichever way you put it, it's strange that for a problem most of us face every day, we have no idiomatic solution. We write bespoke scripts and pipelines, which often break. They are time-consuming to write and maintain, and every time your schema changes you need to update them manually. Or we get lazy and copy/paste production.

We finally listened to all this feedback, dropped the previous product, and built Synth instead. Synth is a platform for provisioning databases with completely synthetic data. The way Synth works can be broken into 3 main steps.

You first download our CLI tool (a bunch of Python wrapped up in a container) and point it at your database to create a model (we host the models on the Synth platform). This model encodes your schema and foreign key relationships, as well as a semantic representation of your types. We currently use simple regular expressions to classify the semantic types (for example, an address or a license plate). The whole model is represented as a JSON object, so if the classifier gets something wrong you can easily change the semantic type.

Once the model has been created, the next step is to train it. Under the hood we use a combination of copulas and deep-learning models to model the distributions and correlations in your dataset (the intuition here is that realistic data is much more useful to developers than samples from a random number generator).

The final step is to use the trained model to generate synthetic data.
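As an aside on the first step above: the following is only a rough sketch of what regex-based semantic typing of columns might look like, not Synth's actual code. The pattern names, the regexes themselves, the 0.8 match threshold, and the shape of the resulting JSON-style model are all assumptions made up for illustration.

```python
import json
import re

# Hypothetical patterns, invented for this example.
SEMANTIC_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[a-zA-Z]{2,}"),
    "ipv4_address": re.compile(r"(?:\d{1,3}\.){3}\d{1,3}"),
    "uk_license_plate": re.compile(r"[A-Z]{2}\d{2} ?[A-Z]{3}"),
}


def classify_column(values, threshold=0.8):
    """Return the first semantic type whose regex fully matches at least
    `threshold` of the non-empty values, or "unknown" if none does."""
    samples = [v.strip() for v in values if v and v.strip()]
    if not samples:
        return "unknown"
    for name, pattern in SEMANTIC_PATTERNS.items():
        hits = sum(1 for v in samples if pattern.fullmatch(v))
        if hits / len(samples) >= threshold:
            return name
    return "unknown"


def build_model(table, columns):
    """Build a JSON-serialisable description of one table's semantic types."""
    return {
        "table": table,
        "columns": {
            col: {"semantic_type": classify_column(vals)}
            for col, vals in columns.items()
        },
    }


if __name__ == "__main__":
    model = build_model(
        "users",
        {
            "contact": ["ann@example.com", "bob@example.org"],
            "plate": ["AB12 CDE", "XY99ZZZ"],
            "notes": ["hello", "world"],
        },
    )
    # A misclassified column can be corrected by editing the model directly.
    model["columns"]["notes"]["semantic_type"] = "free_text"
    print(json.dumps(model, indent=2))
```

Because the model is plain data, overriding a wrong guess is a one-line edit, which is the property the post describes for its JSON model.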
You can either sample directly from the model, or we can spin up a database for you and fill it with as much data as you need. The generation step samples from the trained model to create realistic data, and uses bespoke generators for sensitive fields (credit card numbers, names, addresses, etc.).

You can run the entire lifecycle in a single command: you point the CLI tool at your database (currently Postgres, MySQL and MSSQL) and in ~1 minute you get an IP address and credentials for your new database with completely synthetic data.

We're longtime fans of HN and are eagerly looking forward to feedback from the community (especially criticism). We've made a free version available for this week so you can try it with no strings attached. We hope some of you will find Synth useful. If you have any questions, we'll be around throughout the day. Also feel free to get in touch via the site. Thanks!

~ Christos, Damien & Nodar August 18, 2020 at 08:09PM

Show HN: Nice Ice – A widget for collecting user feedback with one LoC https://ift.tt/348W48K

Show HN: Nice Ice – A widget for collecting user feedback with one LoC https://niceice.io August 18, 2020 at 06:41PM

Show HN: ProgressKer The all-in-one progress tracker app for your daily routine https://ift.tt/3g6qPO9

Show HN: ProgressKer The all-in-one progress tracker app for your daily routine https://ift.tt/2EcpFn2 August 18, 2020 at 02:56PM

Show HN: RGB Color Spectrum Visualization Tool https://ift.tt/349lrat

Show HN: RGB Color Spectrum Visualization Tool https://ift.tt/3iOY1v7 August 18, 2020 at 11:31AM

Show HN: Made in India CSS https://ift.tt/34cS5rQ

Show HN: Made in India CSS https://ift.tt/2EePjHy August 18, 2020 at 01:05PM

Show HN: WizAtHome – Work from Home Wellness Management https://ift.tt/316nKsZ

Show HN: WizAtHome – Work from Home Wellness Management https://ift.tt/3hkwhOq August 18, 2020 at 11:04AM

Show HN: Chrome extension: Gives Ctrl+F like find results using GloVe vectors https://ift.tt/31a71Fv

Show HN: Chrome extension: Gives Ctrl+F like find results using GloVe vectors https://ift.tt/1Tx74hR August 18, 2020 at 07:28AM

Show HN: Convert Kubernetes resources to helm charts with Palinarus https://ift.tt/3azeQHR

Show HN: Convert Kubernetes resources to helm charts with Palinarus https://ift.tt/3iK8BU8 August 18, 2020 at 02:29AM

Show HN: Lorempdf.com – Create sample PDFs quick and easy https://ift.tt/2E2Sdzp

Show HN: Lorempdf.com – Create sample PDFs quick and easy https://ift.tt/2E1pzii August 18, 2020 at 05:25AM

Show HN: Dropbase 2.0 – Turn your offline files into live databases, instantly https://ift.tt/2Y7KqXY

Show HN: Dropbase 2.0 – Turn your offline files into live databases, instantly https://ift.tt/3ehHn4I August 18, 2020 at 12:38AM

Show HN: I'm building a cloud cost tool for Terraform https://ift.tt/2E0IJVw

Show HN: I'm building a cloud cost tool for Terraform https://ift.tt/3dus4p8 August 18, 2020 at 12:10AM