Coverage by Bhat Dittakavi of Variance.AI of “Intelligent and Self-driving Cloud?” by Joydeep Sen Sarma of Qubole India at the NASSCOM AI Summit on 23rd June 2017
Joydeep Sen Sarma
Can the cloud learn by itself? Can the cloud become smart? Can AI build a self-driving cloud?
Cloud challenges can be split into the following 7 areas. I believe AI can help make the cloud intelligent in the first five. Automating the last two, alerting and troubleshooting, is something I believe AI can’t do by itself for now.
1) Server Scaling (Can be automated)
2) Server Configuration (Ability to instantiate a node for a workload)
-Which machine to use, and when, for which workload?
3) Data Management (Can be automated)
-What indexes to create?
-What datasets to de-normalize?
-What data cubes to pre-create?
-How to leverage analysis that has been done before?
We can automate this.
4) Job Configuration Management (Can be automated)
We have to track the following parameters.
-Threads per executor
-Memory per executor
-Threads per core
-# of reducers (Big Data)
Two approaches help here:
-Cost models built on historical data
-Continuous exploration of the solution space (patterns keep changing, hence the need to keep exploring)
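The two ideas above (a cost model built from historical runs, plus continued exploration) can be sketched roughly as follows. This is only an illustration, not something shown in the talk: the class name `ReducerTuner`, the epsilon-greedy strategy, and the use of mean runtime as the "cost model" are all assumptions made for the example.

```python
import random
from collections import defaultdict
from statistics import mean


class ReducerTuner:
    """Hypothetical tuner: pick a reducer count from past runs, but keep exploring."""

    def __init__(self, candidates, epsilon=0.1, seed=None):
        self.candidates = list(candidates)   # reducer counts we are willing to try
        self.epsilon = epsilon               # fraction of runs spent exploring
        self.history = defaultdict(list)     # reducer count -> observed runtimes (s)
        self.rng = random.Random(seed)

    def record(self, reducers, runtime_seconds):
        """Store the observed runtime of a finished job."""
        self.history[reducers].append(runtime_seconds)

    def best_known(self):
        """Cost model (assumed): the setting with the lowest mean runtime so far."""
        averages = {r: mean(times) for r, times in self.history.items()}
        if not averages:
            return None
        return min(averages, key=averages.get)

    def choose(self):
        """Mostly exploit the best known setting; sometimes explore another one."""
        best = self.best_known()
        if best is None or self.rng.random() < self.epsilon:
            untried = [c for c in self.candidates if c not in self.history]
            return self.rng.choice(untried or self.candidates)
        return best


tuner = ReducerTuner([8, 16, 32], epsilon=0.1, seed=42)
tuner.record(8, 120.0)
tuner.record(16, 80.0)
tuner.record(32, 100.0)
print(tuner.best_known())
```

Because workload patterns drift, the exploration share (`epsilon`) never drops to zero in this sketch; a real system would likely also age out old measurements.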
We could analyse database workloads across many customers to see which SQL tables are joined most frequently. We looked at the top-k join groups so that DBAs don’t have to worry about them.
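As a rough illustration of finding the top-k join groups from a query log, here is a minimal sketch. The regex-based table extraction is a deliberate simplification (it ignores comma-style joins, subqueries, and CTEs), and the function names `join_group` and `top_k_join_groups` are hypothetical, not taken from any Qubole product.

```python
import re
from collections import Counter

# Assumption: table names follow FROM or JOIN keywords directly.
TABLE_RE = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def join_group(sql):
    """Return the set of tables a query touches, or None if it joins nothing."""
    tables = {name.lower() for name in TABLE_RE.findall(sql)}
    return frozenset(tables) if len(tables) >= 2 else None


def top_k_join_groups(queries, k=3):
    """Count how often each group of tables is joined and return the k most common."""
    counts = Counter()
    for sql in queries:
        group = join_group(sql)
        if group is not None:
            counts[group] += 1
    return [(sorted(group), n) for group, n in counts.most_common(k)]


log = [
    "SELECT * FROM orders o JOIN customers c ON o.cid = c.id",
    "select c.name from customers c join orders o on o.cid = c.id",
    "SELECT * FROM orders JOIN items ON orders.id = items.oid",
    "SELECT count(*) FROM orders",
]
print(top_k_join_groups(log, k=2))
```

The most frequent groups are natural candidates for de-normalization or pre-joined tables, which is the kind of decision the data management section asks about.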
5) User Experience (Easy to automate)
How long will query take? -Forecasting
How to leverage analysis that has been done before? -Query Clustering
If the cloud forecasts how long a query will take and gives an expected turnaround time, the user can take a coffee break and come back for the results.
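Forecasting and query clustering can be combined in a very simple way: group past queries that differ only in their literal values, then predict a new query's runtime from its group. The sketch below assumes this fingerprint-based grouping and a median forecast; both are illustrative choices, and `fingerprint` and `RuntimeForecaster` are hypothetical names.

```python
import re
from collections import defaultdict
from statistics import median


def fingerprint(sql):
    """Normalize a query so that queries differing only in literals match."""
    s = sql.strip().lower()
    s = re.sub(r"'[^']*'", "?", s)            # replace string literals
    s = re.sub(r"\b\d+(\.\d+)?\b", "?", s)    # replace numeric literals
    s = re.sub(r"\s+", " ", s)                # collapse whitespace
    return s


class RuntimeForecaster:
    """Forecast runtime as the median of past runs in the same query cluster."""

    def __init__(self):
        self.runtimes = defaultdict(list)     # fingerprint -> runtimes (s)

    def record(self, sql, seconds):
        self.runtimes[fingerprint(sql)].append(seconds)

    def forecast(self, sql):
        past = self.runtimes.get(fingerprint(sql))
        return median(past) if past else None  # None: no similar query seen yet


forecaster = RuntimeForecaster()
forecaster.record("SELECT * FROM sales WHERE day = '2017-06-01'", 40)
forecaster.record("SELECT * FROM sales WHERE day = '2017-06-02'", 60)
print(forecaster.forecast("SELECT * FROM sales WHERE day = '2017-06-23'"))
```

A production system would presumably also account for data volume and cluster size; this sketch only shows how earlier analysis can be reused for a forecast.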
6) Alerting and Monitoring (Not easy to automate)
Common Practice: Threshold-based alerts
-Right thresholds keep changing
-Too many false positives or false negatives
Need to move to learning-based alerts
-Current products are behind the times.
I get an alert and I ignore it because it is not material. A self-driving cloud has to treat such alerts as a kind of spam: classify and prioritize alerts, similar to the way email gets classified. I haven’t seen this happening yet.
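To make the contrast with fixed thresholds concrete, here is a minimal sketch of an alert whose threshold is learned from recent history. It assumes a rolling mean and standard deviation as the baseline, which is only one of many possible approaches and not one named in the talk; the class name `AdaptiveAlert` and its parameters are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class AdaptiveAlert:
    """Flag a metric value only if it is unusual relative to recent values."""

    def __init__(self, window=30, sigmas=3.0, min_samples=5):
        self.values = deque(maxlen=window)   # rolling window of recent values
        self.sigmas = sigmas                 # how many std devs count as unusual
        self.min_samples = min_samples       # stay silent until a baseline exists

    def observe(self, value):
        """Return True if value is anomalous, then add it to the baseline."""
        anomalous = False
        if len(self.values) >= self.min_samples:
            mu = mean(self.values)
            sd = pstdev(self.values)
            # A perfectly flat baseline (sd == 0) never alerts in this sketch.
            anomalous = sd > 0 and abs(value - mu) > self.sigmas * sd
        self.values.append(value)
        return anomalous


alert = AdaptiveAlert(window=30, sigmas=3.0)
for cpu in [50, 52, 48, 51, 49, 50, 95]:
    if alert.observe(cpu):
        print("unusual CPU reading:", cpu)
```

Because the window keeps moving, a load level that becomes the new normal stops triggering alerts on its own, which a static threshold cannot do. Prioritizing alerts the way email spam is classified would additionally need feedback on which alerts operators ignored.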
7) Troubleshooting (Not easy to automate)
Even among humans, few are able to troubleshoot well, as there are many layers of systems. There are always situations where data is not captured and domain knowledge is required.
Will AI reduce training opportunities for humans? More automation means we as humans tend to forget skills or become unequipped. Is that good? A question to ponder.
Q) Who bears the risk of the cloud? The vendor or the consumer?