
Showing posts with label TensorFlow.

Monday, July 23, 2018

Ready-to-deploy deep learning solutions

Accelerate your deep learning project deployments with Radeon Instinct™ powered solutions

Deep learning adoption is lagging as companies struggle with how to make it work. Now a new ecosystem is rising to deliver the integrated pieces that ultimately will be part of one turnkey system for deep learning.

Automation has proved its worth in meeting IT and business objectives. Even so, efficiencies in automation and work augmentation software can be greatly enhanced with deep learning. Yet deep learning adoption rates are low. That's in part because the tech is difficult and the talent pool is thin. The good news is that an ecosystem is forming and already beginning to resolve some of these issues as it continues to grow toward becoming a single turnkey system.

Why it takes an ecosystem

A Deloitte report found that fewer than 10% of the companies surveyed across 17 countries invested in machine learning. The chief reasons for the adoption gap are a lack of understanding of how to use the technology, an insufficient amount of data to train it with, and a shortage of talent to make it all work. Translated into the simplest of terms, deep learning is perceived by some to be too hard to deploy for practical use. The solution for that dilemma is what it has always been for any new technology requiring esoteric skill sets and faced with a talent shortage: build an easy-to-use, turnkey system. That is, of course, easier said than done.

"The ongoing digital revolution, which has been reducing frictional, transactional costs for years, has accelerated recently with tremendous increases in electronic data, the ubiquity of mobile interfaces, and the growing power of artificial intelligence (AI)," according to a McKinsey & Company report. "Together, these forces are reshaping customer expectations and creating the potential for virtually every sector with a distribution component to have its borders redrawn or redefined, at a more rapid pace than we have previously experienced."

That's why today's sophisticated and complex systems are commonly constructed not by a single vendor but by a strong and diverse ecosystem capable of delivering the many moving parts needed to make a single turnkey system, especially when said systems must be equally workable for companies across industries and with diverse needs. As a result, ecosystems are growing at breathtaking speeds. McKinsey & Company analysts predict that new ecosystems are likely to entirely replace many traditional industries by 2025.

Such an ecosystem is forming for machine learning. It's seeded with four recently launched, ready-to-deploy solutions. They center on AMD's Radeon Instinct training accelerator for machine learning and its ROCm Open eCosystem (ROCm), an open source HPC/Hyperscale-class platform for GPU computing. AMD takes open source all the way down to the graphics card level. Open source is key to successfully wrangling machine learning systems, as it leverages the skills and coding work of entire communities and makes an ecosystem functional across technologies and applications.

The ROCm open ecosystem

This newly forming ecosystem is optimal for beginning or expanding your deep learning efforts, whether you are the IT person looking to get pre-configured deep learning technologies in place or the scientist who just needs access to HPC systems with one of the frameworks loaded. Either way, users can quickly get to work with their data.

Developers also have full and open access to the hardware and software, which speeds their work in developing frameworks. Everything AMD develops for its Radeon Instinct system is open source and available on GitHub. The company also has Docker containers for easier installs of ROCm drivers and frameworks, which can be found on the ROCm site for Docker. Caffe and TensorFlow machine learning frameworks are offered now, with more to follow soon. A deep learning solutions page has gone live, which features the four systems that serve as the bud of the blooming ecosystem rooted in AMD technologies. The framework Docker containers will be listed there as well.

This budding machine learning ecosystem is already bearing fruit for organizations looking to launch machine learning training and applications with a minimum of technical effort and expertise by combining fast and easy server deployments, the ROCm Open eCosystem and infrastructure, deep learning framework Docker containers, and optimized MIOpen framework libraries.

The four systems forming the ecosystem center

"Data science is a mix of art and science—and digital grunt work. The reality is that as much as 80 percent of the work on which data scientists spend their time can be fully or partially automated," according to a Deloitte report. This newly forming ecosystem is focused on automating much of the machine learning process. While complicated to achieve, the end results are far easier for organizations to use.

Deloitte identified five key vectors of progress that should help foster significantly greater adoption of machine learning by making it more accessible. "Three of these advancements—automation, data reduction, and training acceleration—make machine learning easier, cheaper, and/or faster. The others—model interpretability and local machine learning—open up applications in new areas," according to the Deloitte report.

There are four prebuilt systems shaping this ecosystem early on. Each is provided by an independent partner and built on or for AMD's Radeon Instinct and ROCm platforms, but their initial presentations are at varying levels of integration. While more partners will join the ecosystem over time, these four provide a solid bedrock for organizations looking to get started in machine learning now.

1) AMAX is providing systems with preloaded ROCm drivers and a choice of framework, either TensorFlow or Caffe, for machine learning, advanced rendering, and HPC applications.

2) Exxact is similarly providing multi-GPU Radeon Instinct-based systems with preloaded ROCm drivers and frameworks for deep learning and HPC-class deployments where performance per watt is important.

3) Inventec provides optimized high-performance systems designed with AMD EPYC™ processors and Radeon Instinct compute technologies, capable of delivering up to 100 teraflops of FP16 compute performance for deep learning and HPC workloads.

4) Supermicro is providing SuperServers supporting Radeon Instinct machine learning accelerators for AI, big data analytics, HPC, and business intelligence applications.

The payoff from leveraging the technologies in a machine learning ecosystem potentially comes in many forms. "A growing number of tools and techniques for data science automation, some offered by established companies and others by venture-backed start-ups, can help reduce the time required to execute a machine learning proof of concept from months to days. And, automating data science means augmenting data scientists' productivity in the face of severe talent shortages," say the Deloitte researchers.
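Since the frameworks arrive preinstalled in those containers, a first smoke test is simply asking TensorFlow which devices it can see. A minimal sanity-check sketch, assuming a TensorFlow 1.x build from one of AMD's ROCm images (the exact image names are on the ROCm Docker pages, not shown here); ROCm builds expose Radeon Instinct cards as ordinary GPU devices:

```python
import tensorflow as tf
from tensorflow.python.client import device_lib

# List every device the runtime can see; on a working ROCm install the
# Radeon Instinct accelerators show up as ordinary GPU devices.
for device in device_lib.list_local_devices():
    print(device.name, device.device_type)

# Or simply ask for the default GPU; an empty string means no GPU was found.
print(tf.test.gpu_device_name() or "no GPU visible")
```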

https://www.hpcwire.com/2018/07/23/ready-to-deploy-deep-learning-solutions/

Sunday, February 25, 2018

Google TPUs open up on cloud; LinkedIn intros Hadoop Dynamometer

Signs continue to point to the fact that big data is getting bigger. More importantly, its very bigness sets the tone for innovation. This trend is seen in two new releases: one, a LinkedIn large-scale #Hadoop test system known as Dynamometer; the other, #Google #TPUs, or Tensor Processing Units, available as a service on the #GoogleCloudPlatform. Both are discussed in this edition of the Talking Data podcast.

Machine and deep learning are both generally seen as means to turbocharge predictive analytics. Google, with its massive data centers and eager army of technologists, has been at the forefront of the technology, with a particular showcase being TensorFlow. This is an open source framework the search giant has fashioned for the highly recursive task of neural processing on massive data sets.

Google TPUs represent a specialized hardware approach to such neural processing. The hardware is proprietary to Google and, as is discussed in the podcast, is of a type somewhat out of the reach of typical IT shops. In February, the company announced that Google TPUs would be available in beta on the Google Cloud Platform. The TPUs reside four to a board, and the boards can be connected as pods via an ultra-fast, dedicated network.

Also discussed in the podcast is Dynamometer, which was open sourced by LinkedIn this month in an effort to improve testing of Hadoop Distributed File System (HDFS) clusters. Such testing has become an issue as cluster node counts have grown into the thousands. Appearing in this edition of Talking Data is Mike Matchett, analyst and founder of the Small World Big Data consultancy. According to Matchett, high-performance Hadoop testing is difficult if teams are required to gather data and configure setups that match the production implementation node for node. The LinkedIn approach, he indicated, takes a novel tack, matching physical HDFS name nodes with simulated data nodes.
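For readers curious what "TPUs as a service" looks like from code: in the TensorFlow 1.x releases current at the time, a Cloud TPU was addressed through a cluster resolver and appeared as a set of remote devices. A minimal sketch, where the TPU name 'demo-tpu' and the surrounding GCP setup are hypothetical:

```python
import tensorflow as tf

# Resolve the gRPC endpoint of a Cloud TPU; 'demo-tpu' is a hypothetical
# TPU name provisioned ahead of time in the same GCP project and zone.
resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu='demo-tpu')

# Open a session against the TPU master and list the remote devices;
# a healthy slice reports its TPU chips alongside the host CPUs.
with tf.Session(resolver.master()) as sess:
    for device in sess.list_devices():
        print(device.name)
```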

http://searchdatamanagement.techtarget.com/podcast/Google-TPUs-open-up-on-cloud-LinkedIn-intros-Hadoop-Dynamometer

Thursday, September 28, 2017

Open-source community pushing big data into AI realm

What’s the surest way to advance a technology in a short time? Give it away — to an open-source community. Seminal big data software library #Apache #Hadoop gained momentum in #opensource, and today, most disruptive #bigdata development is springing from open source as well.

“If people have the community traction, that is the new benchmark,” said John Furrier (@furrier) (pictured, left), co-host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio. This is evident at SiliconANGLE’s and theCUBE’s BigData NYC 2017 event, where Furrier and co-host James Kobielus (@jameskobielus) (pictured, right) discussed the community edge.

Yahoo Inc. just open-sourced its big data search and recommendation software Vespa, following its hugely popular 2006 Hadoop contribution. It clearly believes Vespa can evolve via open-source developer brains just as Hadoop did. “As the community model grows up, you’re starting to see a renaissance of real creative developers,” Furrier said. These developers are not just working out implementation kinks; they’re innovating at a level that makes a difference for applications. “Real creative competition — in a renaissance, that’s really the key,” Furrier stated.

The renaissance will be automated

Much new development branches out from big data per se into artificial intelligence, machine learning, and the internet of things. “Data professionals and developers are moving toward new frameworks like TensorFlow,” Kobielus said. TensorFlow is Google’s open-source deep learning framework. Caffe and Theano are additional open-source deep learning technologies with bustling communities around them. Some of the most exciting work happening in open source (and at Stanford University) revolves around automating the acquisition of data needed to train machine learning models.

Many would like to see deep learning tools and methods operationalized, enabling what some call DataOps or InsightOps (IBM’s term), Kobielus pointed out. “I think what are coming into being are DevOps frameworks to span the entire life cycle of the creation and the training and deployment and iteration of AI,” he said.

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of BigData NYC 2017.

https://siliconangle.com/blog/2017/09/27/open-source-community-pushing-big-data-ai-realm-bigdatanyc/

Sunday, July 2, 2017

Kinetica Gets $50M for Converged GPU Analytics

#Kinetica "s bold plan to build a #converged real-time analytics platform that uses #GPU s and in-memory techniques to power existing #SQL queries alongside deep learning algorithms got a big boost today when it disclosed a $50 million Series A investment from venture capitalists. The use cases for Kinetica‘s technology have evolved over time. It was initially incubated by the US Army and the NSA to build a system capable of tracking terrorist movements in real time, using data from drones and signals intelligence. After changing its name from GIS Federal and moving from Virginia to San Francisco, the company expanded into the enterprise market, and one of its first customers was the US Postal Service, which uses Kinetica to track and analyze the movement of about 200,000 vehicles at one-minute intervals. Kinetica’s capability to speed SQL queries on top of superfast GPU processors has always been one of its main calling cards. It’s what led customers to use its database, dubbed GPUdb, to speed up slow Tableau or Qlik queries, or to provide a real-time analytics layer on top of Hadoop. But last year’s introduction of user defined functions paved the way for Kinetica users to integrate machine learning and deep learning libraries, including TensorFlow, Caffe, and Torch, into the platform. “We’re evolving as a company as well as technology,” Kinetica Co-Founder and CEO Amit Vij tells Datanami. “We’re becoming a converged platform that enables an enterprise to use a single unified technology to access a relational database and run various complex algorithms, analytics, and BI.” An ‘Apple’ Experience The idea is positioning GPUdb as single platform that can execute a wide range of analytic workloads that today are loosely referred to as “big data analytics.” Or in Vij’s words, it’s about “creating an Apple experience” for his customers.

It’s all about taking what was complex and difficult, and making it simple and easy, Vij says.

“It feels like organizations are having to duct tape five to 10 technologies that were loosely created on different release cycles and then spend several months, if not several years trying to put it into production,” Vij says, “and it’s still batch oriented and they don’t get real time results for their company.”

Hadoop was supposed to be that central platform for big data analytics, Vij acknowledges, but for whatever reason, it hasn’t worked out. “Hadoop is still fundamentally a good file system,” he says. “But [the Hadoop solutions] weren’t created to be an in-memory database, whereas that was our sole purpose when we first started.”

Hadoop is still good, Vij continues. “For organizations that have massive amounts of data, it’s an excellent place to store data and have the data lake,” he says. “We integrate with Hadoop, and we’re a fast layer that can be architected…on top to provide that real-time analytics for specific problems.”

Broad GPU Appeal

Kinetica today sees three primary use cases for its in-memory, GPU-accelerated relational database.

The first one is providing location-based analytics, which is applicable across industries and the government. The second one is speeding up OLAP queries submitted by users of BI tools. The third is the newest and involves incorporating the latest machine learning and deep learning algorithms into the data analytics workflow.

Vij elaborated on how deep learning and AI approaches will mesh with the Kinetica platform:

“If you have a multi-billion row dataset in Kinetica, you’re going to run various database filters and aggregates and create your training data set,” he says. “You take this new data set formulated by the database, and create a trained model within Google TensorFlow. And now you can execute that model against a table within Kinetica or a new materialized view that’s created.”
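Kinetica's own client API isn't shown in the article, but the shape of Vij's filter-train-score loop is easy to sketch in plain TensorFlow 1.x. In the sketch below, `fetch_training_frame` is a hypothetical stand-in for whatever database call runs the filters and aggregates; it fabricates synthetic rows so the example runs end to end:

```python
import numpy as np
import tensorflow as tf

def fetch_training_frame(sql):
    """Hypothetical stand-in for the database call that runs the filters and
    aggregates; returns synthetic rows here so the sketch is runnable."""
    rng = np.random.RandomState(0)
    features = rng.randn(256, 2).astype(np.float32)
    labels = (features[:, 0] + features[:, 1] > 0).astype(np.float32)
    return features, labels

# 1) Let the database do the heavy filtering to formulate the training set.
features, labels = fetch_training_frame("SELECT f1, f2, label FROM trades WHERE ...")

# 2) Train a small model in TensorFlow (logistic regression as a placeholder).
x = tf.placeholder(tf.float32, [None, 2])
y = tf.placeholder(tf.float32, [None])
logits = tf.squeeze(tf.layers.dense(x, 1), axis=1)
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(200):
        sess.run(train_op, feed_dict={x: features, y: labels})
    # 3) Score fresh rows pulled from a table or materialized view.
    scores = sess.run(tf.sigmoid(logits), feed_dict={x: features})
    print("first few risk scores:", scores[:5])
```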

One Kinetica customer is a bank that's using this approach to power its daily financial risk exposure calculations. The bank had been using a combination of tools, including models written in Python and executed on Spark, but it was only able to work on a subset of the data.

“We enable this organization to do this all in real time and operate on the entire data corpus,” Vij says. “Because there’s a reduced amount of complexity in installing and working with our technology, organizations in turn need less personnel. You don’t need a PhD that’s an expert in Hadoop and another one on Spark and another in ML. You can condense all of that.”

Analytics Looking Forward

The Series A investment of $50 million was co-led by Canvas Ventures and Meritech Capital Partners. Vij says the company plans to use the resources to fuel engineering, sales, and marketing initiatives. On the engineering front, the company plans to bolster its SQL compliance, work on further integrating machine learning and AI into the platform, and push deeper into the cloud.

Vij says the biggest bottleneck at this point is getting NVIDIA GPUs into its clients' data centers. That's why the recent adoption of GPUs by cloud providers has been such a good thing for Kinetica's business model.

“With our smaller customers, the folks who can’t afford an IBM Netezza, SAP HANA, or Oracle Exadata, many times they start out on one to two servers on the cloud,” Vij says. “Things have definitely evolved.”

https://www.datanami.com/2017/06/29/kinetica-gets-50m-converged-gpu-analytics/

Thursday, June 29, 2017

TensorFlow to Hadoop By Way of Datameer

Companies that want to use #TensorFlow to execute deep learning models on big data stored in #Hadoop may want to check out the new #SmartAI offering unveiled by #Datameer today.

Deep learning has emerged as one of the hottest techniques for turning massive sets of unstructured data into useful information, and Google's TensorFlow is arguably the most popular programming and runtime framework for enabling it. So it made sense that Datameer, which was one of the first vendors to develop a soup-to-nuts Hadoop application for big data analytics, has now added support for TensorFlow to its Hadoop-based application.

With today's unveiling of SmartAI, Datameer is providing a way to execute and operationalize TensorFlow models. "The objective here is to take the stuff that mad scientists are coming up with, and actually take it to the business," Datameer's Senior Director of Product Marketing John Morrell tells Datanami.

SmartAI, which is still in technical preview, is not helping data scientists to create the models. They will still do that in their favorite coding environment. Nor is it set up to train the models. If you're interested in learning how that can be accomplished on Hadoop, Hortonworks has a good blog post on integrating TensorFlow assemblies into YARN. Rather, Datameer's new app is all about solving some of the thorny "last mile" problems that organizations often encounter as they're moving a trained TensorFlow model from the lab into production.

"AI today has had some problems in terms of operationalization," Morrell says. "When a data scientist comes up with a formula using their data science tools, they just chuck it over the wall to the IT guy, who then tries to turn it into code, and custom code the whole thing."
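Datameer hasn't published SmartAI's internals, but the usual hand-off artifact for moving a trained TensorFlow model from the lab toward production is the SavedModel format. A minimal sketch using the TensorFlow 1.x builder API, with a toy model and illustrative paths and tensor names:

```python
import tensorflow as tf

# A trivial "trained" model, y = x * w, standing in for a real network.
x = tf.placeholder(tf.float32, [None, 1], name="x")
w = tf.Variable([[2.0]], name="w")
y = tf.matmul(x, w, name="y")

export_dir = "/tmp/smart_model/1"  # illustrative path

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Package graph plus weights into a SavedModel any TF runtime can load.
    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    builder.add_meta_graph_and_variables(sess, [tf.saved_model.tag_constants.SERVING])
    builder.save()

# The production side reloads the same artifact without the training code.
with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], export_dir)
    print(sess.run("y:0", feed_dict={"x:0": [[3.0]]}))  # -> [[6.0]]
```

The point is that the production side needs only the exported artifact, not the data scientist's training pipeline, which is exactly the "last mile" gap described above.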

https://www.datanami.com/2017/06/28/tensorflow-hadoop-way-datameer/

Wednesday, June 28, 2017

Google Stakes Its Future on a Piece of Software

Early in 2015, #artificialintelligence researchers at #Google created an obscure piece of software called #TensorFlow. Two years later the tool, which is used in building machine-learning software, underpins many future ambitions of Google and its parent company, #Alphabet. TensorFlow makes it much easier for the company's engineers to translate new approaches to artificial intelligence into practical code, improving services such as search and the accuracy of speech recognition. But just months after TensorFlow was released to Google's army of coders, the company also began offering it to the world for free.

https://www.technologyreview.com/s/608094/google-stakes-its-future-on-a-piece-of-software/

Thursday, February 16, 2017

Yahoo & Microsoft open source data analytics tools for Spark & Graph Engine

#Yahoo and #Microsoft have open sourced two data analytics tools that can help businesses become more data-driven. Yahoo has decided to open source the #TensorFlowOnSpark software that was created to make #Google's #TensorFlow open source framework compatible with the data sets that sit inside #Spark clusters. To grossly simplify, TensorFlow is an open source software library that users can tap into for numerical computation using data flow graphs.

The company said: "TensorFlowOnSpark brings scalable deep learning to Apache Hadoop and Apache Spark clusters. By combining salient features from deep learning framework TensorFlow and big-data frameworks Apache Spark and Apache Hadoop, TensorFlowOnSpark enables distributed deep learning on a cluster of GPU and CPU servers."

It's easy to forget that Yahoo has some brilliant technical minds working at the company, what with all the issues regarding the data breaches on the Internet side of the business, but the company has history in the big data world.
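The TensorFlowOnSpark repository on GitHub ships runnable examples; based on those published examples, the general shape is a map function that every Spark executor runs as one TensorFlow node. A compressed sketch in which the cluster sizes, the map function body, and the RDD are all illustrative:

```python
from pyspark import SparkConf, SparkContext
from tensorflowonspark import TFCluster

def map_fun(args, ctx):
    # Runs once per executor; ctx describes this node's role (worker or ps)
    # and the cluster spec TensorFlowOnSpark assembled from Spark executors.
    import tensorflow as tf
    print("node %d is a %s" % (ctx.task_index, ctx.job_name))
    # ... build and train a distributed TensorFlow graph here ...

sc = SparkContext(conf=SparkConf().setAppName("tfos_sketch"))
data_rdd = sc.parallelize([[0.0], [1.0]])  # illustrative training data

# Reserve 4 executors as TF nodes (3 workers plus 1 parameter server),
# feeding training data in from Spark RDD partitions.
cluster = TFCluster.run(sc, map_fun, None, 4, 1, False, TFCluster.InputMode.SPARK)
cluster.train(data_rdd, 1)
cluster.shutdown()
```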
http://www.cbronline.com/news/big-data/analytics/yahoo-microsoft-open-source-data-analytics-tools-spark-graph-engine/

Sunday, September 25, 2016

What's in that photo? Google open-sources caption tool in TensorFlow that can tell you

#Google has open-sourced a model for its machine-learning system, called #ShowandTell, which can view an image and generate accurate and original captions.

The model it's released is faster to train and better at captioning images than the versions that previously helped it secure a tied first place with #MicrosoftResearch in #Microsoft's COCO 2015 image-captioning contest.

The image-captioning system is available for use with #TensorFlow, Google's open machine-learning framework, and boasts a 93.9 percent accuracy rate on the ImageNet classification task, inching up from previous iterations.

The code includes an improved vision model, allowing the image-captioning system to recognize different objects in images and hence generate better descriptions.

An improved image model meanwhile aids the captioning system's powers of description, so that it not only identifies a dog, grass and frisbee in an image, but describes the color of grass and more contextual detail.

The improvements, detailed in a new paper, apply recent advances in computer vision and machine translation to image-captioning challenges. Google researchers see potential for it as an accessibility tool for visually-impaired people when viewing images on the web.
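The released code lives in Google's tensorflow/models repository; conceptually, the architecture feeds an image embedding into a recurrent language model that emits the caption one word at a time. A toy sketch of that encoder-decoder idea in TensorFlow 1.x (this is not Google's im2txt code, and all sizes are illustrative):

```python
import tensorflow as tf

vocab_size, embed_dim, hidden = 1000, 256, 256

image_features = tf.placeholder(tf.float32, [None, 2048])  # pooled CNN features
caption_tokens = tf.placeholder(tf.int32, [None, 20])      # word ids per caption

# Project the image into the same space as word embeddings, then let it
# "prime" an LSTM that predicts each next word of the caption.
img_embed = tf.layers.dense(image_features, embed_dim)
word_embed = tf.contrib.layers.embed_sequence(caption_tokens, vocab_size, embed_dim)

cell = tf.nn.rnn_cell.LSTMCell(hidden)
inputs = tf.concat([tf.expand_dims(img_embed, 1), word_embed[:, :-1, :]], axis=1)
outputs, _ = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
logits = tf.layers.dense(outputs, vocab_size)  # next-word scores at each step
```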
http://www.zdnet.com/article/whats-in-that-photo-google-open-sources-caption-tool-in-tensorflow-that-can-tell-you/

Sunday, August 28, 2016

Google Cloud shut down this guy's business — but now he's a fan for life

On Monday, @FredTrotter, CEO of a healthcare startup called #DocGraph, came into work only to discover that his cloud computing provider, #Google, had effectively shut down his company, sending him and his team into a panic.

DocGraph, through its sister company, CareSet, sells Medicare data and analysis to help improve patient care and track the effectiveness of drugs. It not only stores its data with Google, but also relies on Google's machine learning framework, #TensorFlow, to help it with the analysis. Which meant that when Google shut off access to his whole project, not just a single problematic service, he couldn't simply move his app to another cloud and start serving his customers again.

But by Friday, Trotter was so impressed with Google and how the cloud team ultimately handled the situation that he's sticking with them, he told Business Insider.

http://www.businessinsider.com/google-cloud-won-skeptic-after-shutting-site-down-2016-8

Wednesday, July 20, 2016

Here's Why Google Is Open-Sourcing Some Of Its Most Important Technology

We think of art as the most human of endeavors, but in recent years we’ve learned that machines can understand creativity too. There are algorithms that evaluate songs and movies for record companies and movie studios. One music professor even created a program that wrote compositions which drew critical acclaim.

Paradoxically, developing algorithms that can create artistic works pushes the bounds of human capability. Unlike machines that, say, dig holes or build cars, algorithms that produce creative work need to understand things that even humans find difficult to articulate. That's the idea behind #Google's #Magenta project, which is developing machine learning tools for art and music.

Magenta is built on top of #TensorFlow, the library of machine learning tools that the company recently released as an open source technology, making the source code available to anyone who wants to download it. To get a sense of why Google would open up its most advanced technology, which is at the heart of its most important products, I asked company executives about it.

http://www.forbes.com/sites/gregsatell/2016/07/18/heres-why-google-is-open-sourcing-some-of-its-most-important-technology/

Monday, November 9, 2015

Google Just Open Sourced TensorFlow, Its Artificial Intelligence Engine

Tech pundit Tim O'Reilly had just tried the new #GooglePhotos app, and he was amazed by the depth of its artificial intelligence.

O’Reilly was standing a few feet from #Google CEO and co-founder @LarryPage this past May, at a small cocktail reception for the press at the annual #GoogleI/O conference—the centerpiece of the company’s year. Google had unveiled its personal photos app earlier in the day, and O’Reilly marveled that if he typed something like “gravestone” into the search box, the app could find a photo of his uncle’s grave, taken so long ago.

The app uses an increasingly powerful form of artificial intelligence called deep learning. By analyzing thousands of photos of gravestones, this AI technology can learn to identify a gravestone it has never seen before. The same goes for cats and dogs, trees and clouds, flowers and food.

The Google Photos search engine isn’t perfect. But its accuracy is enormously impressive—so impressive that O’Reilly couldn’t understand why Google didn’t sell access to its AI engine via the Internet, cloud-computing style, letting others drive their apps with the same machine learning. That could be Google’s real money-maker, he said. After all, Google also uses this AI engine to recognize spoken words, translate from one language to another, improve Internet search results, and more. The rest of the world could turn this tech towards so many other tasks, from ad targeting to computer security.

Well, this morning, Google took O’Reilly’s idea further than even he expected. It’s not selling access to its deep learning engine. It’s open sourcing that engine, freely sharing the underlying code with the world at large. This software is called TensorFlow, and in literally giving the technology away, Google believes it can accelerate the evolution of AI. Through open source, outsiders can help improve on Google’s technology and, yes, return these improvements back to Google.

“What we’re hoping is that the community adopts this as a good way of expressing machine learning algorithms of lots of different types, and also contributes to building and improving [ #TensorFlow ] in lots of different and interesting ways,” says Jeff Dean, one of Google’s most important engineers and a key player in the rise of its deep learning tech.
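For readers who never saw early TensorFlow code, "expressing" a machine learning algorithm meant declaring a dataflow graph and then running it in a session. A minimal sketch in the 1.x style the library launched with:

```python
import tensorflow as tf

# Declare the computation as a graph: nothing runs yet.
a = tf.placeholder(tf.float32, name="a")
b = tf.placeholder(tf.float32, name="b")
total = tf.add(a, b, name="total")

# A session executes the graph, on a CPU, a GPU, or a remote cluster alike.
with tf.Session() as sess:
    print(sess.run(total, feed_dict={a: 2.0, b: 3.0}))  # -> 5.0
```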

In recent years, other companies and researchers have also made huge strides in this area of #AI, including #Facebook, #Microsoft, and #Twitter. And some have already open sourced software that’s similar to TensorFlow. This includes Torch—a system originally built by researchers at New York University, many of whom are now at Facebook—as well as systems like Caffe and Theano. But Google’s move is significant. That’s because Google’s AI engine is regarded by some as the world’s most advanced—and because, well, it’s Google.

“This is really interesting,” says Chris Nicholson, who runs a deep learning startup called Skymind. “Google is five to seven years ahead of the rest of the world. If they open source their tools, this can make everybody else better at machine learning.”

To be sure, Google isn’t giving away all its secrets. At the moment, the company is only open sourcing part of this AI engine. It’s sharing only some of the algorithms that run atop the engine. And it’s not sharing access to the remarkably advanced hardware infrastructure that drives this engine (that would certainly come with a price tag). But Google is giving away at least some of its most important data center software, and that’s not something it has typically done in the past.

Google became the Internet’s most dominant force in large part because of the uniquely powerful software and hardware it built inside its computer data centers—software and hardware that could help run all its online services, that could juggle traffic and data from an unprecedented number of people across the globe. And typically, it didn’t share its designs with the rest of the world until it had moved on to other designs. Even then, it merely shared research papers describing its tech. The company didn’t open source its code. That’s how it kept an advantage.

http://www.wired.com/2015/11/google-open-sources-its-artificial-intelligence-engine/