In deep learning, more functions are better


A highlight of WWDC 2019 - at least for developers - was Apple's announcement that version 3 of the machine learning framework Core ML would run on all system platforms (tvOS, macOS, watchOS, iOS and iPadOS) and would now also support machine learning directly on the device, for example for personalization purposes ("Model Personalization").

Measured against this, the news and improvements in Core ML are much more modest this year. It even looks as if Apple has dropped the version number: while last year's update was called Core ML 3, it is now simply called Core ML - even though the CoreMLTools indicate version 4.

Apple Core ML (4): What's new

Speaking of CoreMLTools: creating your own models with Apple's Create ML framework is fun for simple projects. In practice, however, models from TensorFlow or PyTorch are more common. To use such a model with Core ML, app developers must first convert it to the mlmodel file format - this is what the CoreMLTools mentioned above are for.

The new Core ML version offers an interesting way to exchange models (the mlmodel file) without publishing an app update. In fairness it has to be said that this is not a new idea; several third-party SDKs already offer it. The big advantage of the in-house solution is that the models are hosted in Apple's cloud. Since app developers may have more than one model in their app, the new concept of model collections allows them to group multiple models using the CloudKit dashboard. To prepare a Core ML model for cloud deployment, Xcode now offers a Create Model Archive button, which creates a .mlarchive file. A look inside reveals that such an mlarchive is apparently "only" a zip file containing the mlmodelc folders. Developers can upload it to the CloudKit dashboard and add it to a model collection.
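On the app side, such a collection is accessed via Core ML's new MLModelCollection API. The following Swift snippet is a minimal sketch of that access path; the collection identifier "FeatureModels" and the model name "Classifier" are hypothetical placeholders, and in a real app the entry names correspond to the models uploaded via the CloudKit dashboard.

    import CoreML

    // Minimal sketch: access a model collection deployed via CloudKit.
    // "FeatureModels" and "Classifier" are hypothetical names used for illustration.
    _ = MLModelCollection.beginAccessing(identifier: "FeatureModels") { result in
        switch result {
        case .success(let collection):
            // Each entry corresponds to one deployed model in the collection.
            if let entry = collection.entries["Classifier"],
               let model = try? MLModel(contentsOf: entry.modelURL) {
                print("Loaded deployed model: \(model.modelDescription)")
            }
        case .failure(let error):
            // For example, no model has been downloaded into the app's sandbox yet.
            print("Could not access the model collection: \(error)")
        }
    }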

Another nice new feature is that different model collections can be provided for different app users. For example, the camera on the iPhone differs from the camera on the iPad, so you might want to create two versions of a model and send one to the iPhone users of the app and the other to the iPad users. App developers can define targeting rules for the device class (iPhone / iPad / TV / Watch), the operating system and its version, the region code, the language code and the version of the app. A new model version is not always made available immediately: at some point the app "recognizes" on its own that a new model is available, downloads it automatically and stores it in its sandbox. App developers, however, seem to have no control over when and how this happens.

While this is a convenient solution that spares app developers from hosting the models themselves, keep in mind that the models live in CloudKit: they count against the app publisher's storage quota, and downloading them is counted as network traffic.

Machine learning model: encryption against thieves

Until now, it was quite easy to steal a Core ML model and use it in your own app. This changes with iOS 14 / macOS 11.0: the new Core ML can automatically encrypt and decrypt models, so that strangers can no longer look into the mlmodelc folder. Encryption can be used with or without the new CloudKit deployment. Xcode encrypts the compiled model, mlmodelc, not the original mlmodel file, and the model always remains in encrypted form on the user's device. Only when the app instantiates the model does Core ML decrypt it automatically. The decrypted version exists only in memory; it is never saved as a file. For this to work, an app developer first needs an encryption key. Xcode offers a button for this: selecting it generates a new encryption key, which Xcode links to the Apple developer team account.

In concrete terms, this process creates an .mlmodelkey file, of which every app developer receives a local copy to work with. App developers do not have to embed this encryption key in their app - nor should they! The reason: the key is also stored on Apple's servers, and in order to decrypt the model when the app instantiates it, Core ML must fetch the encryption key from Apple's servers over the network. Accordingly, if the network is unavailable before the encryption key has been downloaded, the application cannot instantiate the Core ML model. For this reason, app developers should use the new YourModel.load() function, whose completion handler lets an app react to loading errors. The error code modelKeyFetch indicates that Core ML was not able to load the decryption key from Apple's servers.
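A minimal sketch of this loading pattern, using the article's placeholder name YourModel for the Xcode-generated model class:

    import CoreML

    // Minimal sketch: load an (encrypted) model asynchronously.
    // YourModel stands for the class Xcode generates from the mlmodel file.
    YourModel.load { result in
        switch result {
        case .success(let model):
            // The model has been decrypted in memory and is ready for predictions.
            print("Model ready: \(model)")
        case .failure(let error):
            // Loading failed - for example because the decryption key could not
            // be fetched from Apple's servers (the modelKeyFetch case described above).
            print("Could not load the model: \(error)")
        }
    }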

  1. Facebook faces
    Computers can learn to distinguish human faces. Facebook uses this for automatic face recognition.
  2. Machine learning
    Contrary to what the picture suggests, machine learning is a sub-area of artificial intelligence - albeit a very important one.
  3. AlphaGo
    Machine beats human: in 2016, Google's machine learning system AlphaGo defeated the world champion in the game of Go.
  4. Nvidia graphics processors (GPUs)
    The leading companies in machine learning use graphics processors (GPUs) - for example from Nvidia - for the parallel processing of data.
  5. Deep learning
    Deep learning methods first learn low-level features such as brightness values, then mid-level features, and finally high-level features such as whole faces.
  6. IBM Watson
    IBM Watson combines several artificial intelligence methods: in addition to machine learning, there are algorithms for natural language processing and information retrieval, knowledge representation and automatic inference.

Machine Learning: Other Apple Frameworks

The iOS / iPadOS SDKs contain several high-level frameworks that perform machine learning-related tasks; they are as old as Core ML itself. One of them is the Vision framework, which has also received a number of new functions. Vision already offered recognition models for faces, facial landmarks and human bodies; the new version adds the following features:

  • Hand position detection (VNDetectHumanHandPoseRequest)

  • Detection of multi-person whole-body poses (VNDetectHumanBodyPoseRequest)

In particular, the addition of multi-person full-body pose recognition is an interesting feature. There are various open-source models on the market, but they are either not very good or slow, and commercial solutions are expensive. In addition to static images, the Apple framework can also analyze videos, whether from a file or from the camera in real time.
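To give a rough idea of how this looks in code, here is a minimal sketch of body pose detection on a still image; hand pose detection works analogously with VNDetectHumanHandPoseRequest. The function name and the confidence threshold are illustrative assumptions.

    import UIKit
    import Vision

    // Minimal sketch: detect the body poses of all persons in a still image.
    func detectBodyPoses(in image: UIImage) {
        guard let cgImage = image.cgImage else { return }
        let request = VNDetectHumanBodyPoseRequest()
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            try handler.perform([request])
            let observations = request.results as? [VNHumanBodyPoseObservation] ?? []
            for (index, observation) in observations.enumerated() {
                // All recognized joints, with normalized coordinates and confidences.
                let joints = try observation.recognizedPoints(.all)
                print("Person \(index): \(joints.count) joints recognized")
                if let rightWrist = joints[.rightWrist], rightWrist.confidence > 0.3 {
                    print("Right wrist at \(rightWrist.location)")
                }
            }
        } catch {
            print("Body pose detection failed: \(error)")
        }
    }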

Also new is VNDetectContoursRequest, which recognizes the outlines of objects in an image and returns them as vector paths. VNGeometryUtils, in turn, provides helper functions for post-processing the recognized contours, such as simplifying them to basic geometric shapes; it can be assumed that PencilKit uses this shape recognition for pen input as well (see the sketch after the list below).

For language processing, app developers can use Apple's Natural Language framework. There are also some new functions here:

  • NLTagger and NLModel can now find multiple tags and predict their relationship.

  • Sentence embeddings: word embeddings were already supported; NLEmbedding now also handles entire sentences. A built-in neural network encodes the whole sentence into a 512-dimensional vector, which helps capture its context.
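For the contour detection mentioned above, a minimal sketch might look as follows; for simplicity it uses VNContour's polygonApproximation(epsilon:) as the post-processing step, and the epsilon value is an illustrative assumption.

    import Vision

    // Minimal sketch: detect contours in an image and post-process the first one.
    func detectContours(in cgImage: CGImage) {
        let request = VNDetectContoursRequest()
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            try handler.perform([request])
            guard let observation = request.results?.first as? VNContoursObservation else { return }
            print("Found \(observation.contourCount) contours")
            // All detected contours combined into one vector path (normalized coordinates).
            print("Bounding box of all contours: \(observation.normalizedPath.boundingBox)")
            if let contour = observation.topLevelContours.first {
                // Simplify the contour to a polygon - the kind of post-processing
                // the article attributes to VNGeometryUtils and PencilKit.
                let simplified = try contour.polygonApproximation(epsilon: 0.01)
                print("Simplified contour has \(simplified.pointCount) points")
            }
        } catch {
            print("Contour detection failed: \(error)")
        }
    }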
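The new sentence embeddings can also be tried out with a few lines of Swift. A minimal sketch, assuming an English sentence-embedding model is available on the device (the example sentences are made up):

    import NaturalLanguage

    // Minimal sketch: encode sentences as vectors and compare them.
    if let embedding = NLEmbedding.sentenceEmbedding(for: .english) {
        let first = "The weather is nice today."
        let second = "It is sunny outside."
        if let vector = embedding.vector(for: first) {
            print("Sentence encoded as a \(vector.count)-dimensional vector")
        }
        // Distance between the two sentence vectors (smaller means more similar).
        let distance = embedding.distance(between: first, and: second)
        print("Distance between the sentences: \(distance)")
    }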

In the Speech and SoundAnalysis frameworks, however, there does not seem to be any news. (mb / fm)