Building a Machine-Learning Platform? Open-Source It!
To open source or keep things tightly proprietary: that’s the existential question that many companies and developers confront on a regular basis. When it comes to machine learning and artificial intelligence (A.I.), though, open source might prove the way to go, especially for smaller firms.

Salesforce recently opened up TransmogrifAI, the machine-learning library that underlies its Einstein A.I. platform. Described on GitHub as an “AutoML library for building modular, reusable, strongly typed machine learning workflows on Spark,” TransmogrifAI is meant for building machine-learning models on structured data such as spreadsheets, exactly the sort of thing that would appeal to Salesforce’s customer-service audience.

Salesforce open-sourced TransmogrifAI the same day that Oracle did the same with Graphpipe. “Graphpipe is an attempt to standardize the protocol by which you speak to a remotely deployed machine learning model,” is how Oracle cloud architect Vish Abrams described the platform to VentureBeat. To put it another way, Graphpipe allows developers to more easily work with machine-learning models in the context of smartphones and IoT devices that depend heavily on a cloud-based server. (For those interested, the code is available on GitHub.)

Salesforce and Oracle are following on the heels of tech giants such as Google, which have already opened up a solid chunk of their machine-learning work to the wider world. It’s potentially a solid strategy for companies that wish to attract other companies and developers to their respective platforms: if you make the code easily accessible, more folks will try it out, and hopefully they’ll like it enough to integrate it into their tech stacks.

For those companies that don’t have the resources or the reach of Google, open-sourcing their A.I. and machine-learning work comes with yet another benefit: a crowd of third-party developers and companies more than happy to provide suggestions, tweak code, and point out errors. Placing your code on GitHub encourages a positive feedback loop that can, potentially, improve a product very quickly, certainly much faster than if a small team of developers tried to iterate in isolation. Given that smaller companies will never have the funding or human resources to catch up to Google or Oracle in terms of A.I., open-sourcing might represent the best way to give a new initiative as much (metaphorical) oxygen as possible.