How to use adaptive width strings for localization on iOS

The Apple App Store is a huge market with potential users and customers from all over the world. With this global reach, it’s more important than ever that your app is ready to be used by people that don’t speak English. For example, at PSPDFKit, a PDF framework that is used by a lot of apps in the App Store, we localize our framework in 29 languages, so that users don’t see parts of an app in English and parts in their local language, which would be a bad experience.

One of the challenges of localization lies in the length of translated texts. Languages like German are especially problematic because their texts tend to be longer than the English ones. In addition, translators often work with isolated strings, where the only context they get (if they get any) is the place where the string is going to be shown and its purpose, but they don’t usually have any idea about the available physical space on the screen. Moreover, the available space may not be constant, as the same app may run on an iPhone or an iPad (or even a Mac in the future?). This generally leads to a less than ideal user experience: either the text is truncated with an ellipsis on some devices, or the same text has different sizes depending on the language, or we decide to arbitrarily shorten the localized text for a particular language, creating a lot of wasted space for iPad Pro users and penalizing them with less explanatory strings.

To solve this problem without having to engineer your own solution, you can use “adaptive strings”, which Apple introduced with iOS 9. This feature is based on string dictionaries (.stringsdict files), which are commonly used to support pluralization rules in apps. If your app does not already have a .stringsdict file, you can create one using Xcode: Go to File, New, File (or press Command+N) and select “Stringsdict file” from the template list.


Xcode dialog box to create a new Stringsdict file.

Once the Stringsdict file is created, you have to modify its contents so that it supports adaptive strings. For each key that should have several width variants, add an NSStringVariableWidthRuleType dictionary with key/value pairs, one for each “class” of screen width that you want to support. The key must be a number that represents the screen width in an abstract way (more on this later), and the value is the localized string. Here’s a sample .stringsdict file showing different possible welcome messages in Spanish:

[Screenshot: sample .stringsdict file for the welcome message, with entries for the widths 20, 25, and 50.]
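Since the structure of the file is easier to follow in text form, here is a minimal sketch that builds an equivalent dictionary in Swift and prints it as an XML property list, which is the format .stringsdict files use. The pairing of the widths 20, 25, and 50 with the three Spanish strings is reconstructed from the behavior described in this article, and the WELCOME_MESSAGE key is the one used in the code at the end of the post; treat both as assumptions rather than a copy of the sample project.

```swift
import Foundation

// Assumed layout of a variable-width entry: the string key maps to a
// dictionary whose NSStringVariableWidthRuleType entry lists one
// localized variant per width class.
let welcomeVariants: [String: String] = [
    "20": "Hola",
    "25": "Bienvenido",
    "50": "Bienvenido a mi app",
]
let stringsdict: [String: Any] = [
    "WELCOME_MESSAGE": [
        "NSStringVariableWidthRuleType": welcomeVariants
    ]
]

// Serialize a dictionary to the XML property list format.
func stringsdictXML(_ plist: [String: Any]) throws -> String {
    let data = try PropertyListSerialization.data(
        fromPropertyList: plist, format: .xml, options: 0)
    return String(decoding: data, as: UTF8.self)
}

let xml = (try? stringsdictXML(stringsdict)) ?? ""
print(xml)
```

The printed XML could be pasted into the Spanish Localizable.stringsdict file; in practice you would normally edit that file directly in Xcode.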

You can download a sample Xcode project from here. If you compile and run the app on an iPhone 5s, an iPhone 6, and an iPad Pro, you’ll get three different experiences: You’ll read the shortest localized string (“Hi”/“Hola”) on an iPhone 5s, “Welcome”/“Bienvenido” on an iPhone 6, and “Welcome to my app”/“Bienvenido a mi app” on an iPad Pro.

What do the 20, 25, 50 numbers mean?

They look like “magic” numbers, but they do actually have a meaning: they intend to abstract the width that is available to show text. Let’s see how UILabel presents localized text on the screen when variable width strings are in place:

When you call NSLocalizedString("STRING_KEY", comment: "String context"), the function returns an NSString instance that you can assign to a UILabel via its text property. It is a very simple API. However, remember that NSString is actually a class cluster, that is, a group of classes that are exposed via a single public abstract superclass. NSLocalizedString took advantage of this design pattern when support for plurals was added. Adaptive strings rely on the same pattern: what the function actually returns is an instance of an internal subclass of NSString, __NSVariableWidthString. When you set the text of a UILabel, the system internally checks whether the NSString that you are passing is an instance of __NSVariableWidthString and, in that case, extracts an appropriate text variant. There’s a public category on NSString, declared in the NSBundle header, to do this: variantFittingPresentationWidth(_:). Its public documentation is empty, but the header file explains a little bit more:


/*
For strings with length variations, such as from a stringsdict file, this method returns the variant at the given width.
If there is no variant at the given width, the one for the next smaller width is returned. And if there are none smaller,
the smallest available is returned. For strings without variations, this method returns self.
The unit that width is expressed in is decided by the application or framework. But it is intended to be some measurement
indicative of the context a string would fit best to avoid truncation and wasted space.
*/
- (NSString *)variantFittingPresentationWidth:(NSInteger)width API_AVAILABLE(macos(10.11), ios(9.0), watchos(2.0), tvos(9.0));
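The fallback rule in that header comment can be sketched in plain Swift. The function below is a hypothetical stand-in written only to illustrate the rule (exact width, else the next smaller one, else the smallest available); it is not Foundation’s implementation, and the dictionary reuses the widths and strings from the sample above.

```swift
// Hypothetical model of the lookup described in the header comment.
func variant(fitting width: Int, in variants: [Int: String]) -> String? {
    let widths = variants.keys.sorted()
    guard let smallest = widths.first else { return nil }
    // Largest declared width that does not exceed the requested one,
    // falling back to the smallest available variant.
    let chosen = widths.last(where: { $0 <= width }) ?? smallest
    return variants[chosen]
}

let welcome = [20: "Hola", 25: "Bienvenido", 50: "Bienvenido a mi app"]
print(variant(fitting: 25, in: welcome) ?? "")  // exact match
print(variant(fitting: 40, in: welcome) ?? "")  // no entry for 40: next smaller width
print(variant(fitting: 10, in: welcome) ?? "")  // nothing smaller: smallest variant
```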

If you feel that the public documentation of this method should at least contain the comments of the header file where it is declared, please duplicate this Radar.

UILabel calls this API when you set the text, but how does it generate a value for the width parameter? The header documentation explains that this heuristic is basically decided by the client. UILabel first extracts the width of the UIWindow where it is placed. Then, it creates an NSAttributedString instance with an uppercase “M” and a default font for body text. Dynamic Type, introduced in iOS 7, provides a convenient API for this: UIFont.preferredFont(forTextStyle: .body). The width parameter is calculated as the number of “M”s that fit inside the UIWindow’s bounds width, that is, UIWindow.bounds.width / emWidth.
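To get a feeling for the numbers this division produces, here is a small sketch without any UIKit dependency. The window widths are the portrait widths in points of the three devices mentioned earlier; the em width of 15 points is a hypothetical value picked for illustration, not a measurement of the body font, and truncating the quotient is also an assumption.

```swift
// Number of whole "M" glyphs that fit in a window of the given width.
// Truncating the quotient is an assumption; the article only gives
// the division itself.
func presentationWidth(windowWidth: Double, emWidth: Double) -> Int {
    precondition(emWidth > 0, "emWidth must be positive")
    return Int(windowWidth / emWidth)
}

// Hypothetical em width in points, not a measured value.
let assumedEmWidth = 15.0
let devices = [("iPhone 5s", 320.0), ("iPhone 6", 375.0), ("iPad Pro 12.9-inch", 1024.0)]
for (device, width) in devices {
    print(device, presentationWidth(windowWidth: width, emWidth: assumedEmWidth))
}
```

Under these assumptions the three devices fall into the 20, 25, and 50 width classes respectively, which matches the behavior described for the sample project.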

The important thing to remember about this heuristic is that it is directly proportional to the screen width, and that the exact numbers you put in the .stringsdict file are not as important as the relationship between them. For example, you could also create just two keys: 1, for small devices, and a bigger number like 30 to have a more detailed translation on devices with more available space. This is the approach that Apple follows for most of the stock iOS apps.

Conclusion

To sum up, Apple provides support for adaptive localized strings via .stringsdict files in two ways:

  • If your codebase uses UIKit components, you don’t need to do anything: the system will show appropriate adaptive strings if you add a .stringsdict file in the way described in this article.
  • If you don’t use UIKit components or you need more flexibility, simply use the variantFittingPresentationWidth(_:) instance method of NSString to manually get the text from the desired key in the .stringsdict file. Something like this:


let string = NSLocalizedString("WELCOME_MESSAGE", comment: "This is the welcome message.") as NSString
let adaptedString = string.variantFittingPresentationWidth(25)

Note that variantFittingPresentationWidth(_:) also returns an instance of the internal type __NSVariableWidthString, so if you assign the result of this API to the text property of a UILabel, it will always be overridden by UIKit’s own heuristic. This has confused people in both Apple’s developer forums and Stack Overflow. One possible solution is to use string interpolation:

welcomeLabel.text = "\(string.variantFittingPresentationWidth(25))"

And that’s it. I hope you enjoyed this article and that it inspires you to build something on top of this API to improve your localization workflow.


Some useful URL schemes in Xcode 9

Not many people know that Apple introduced some interesting automation capabilities in Xcode 9 via URL schemes. I sometimes use them, and as I didn’t see them publicized anywhere, I decided to document them in this blog post.

Source Code Navigation

The new Xcode source editor, written in Swift, has a neat way to link between documentation and source code locations inside a project. First of all, you need to put this comment on top of the target location where you want to go, for example, on top of a method:

/// - Tag: MyAwesomeMethod
func myAwesomeMethod(...) {
...
}

After that, create a Markdown link in a file placed inside the project (the typical README file is a good candidate for this):

[View in Source](x-source-tag://MyAwesomeMethod)

Using this technique, a person who reads your README file and clicks on “View in Source” will navigate directly to the implementation of your awesome method. You can even open this tag URL in your browser and it will bring Xcode to the foreground and navigate to that symbol!

You can see that Apple has been silently using this feature in their recent sample code: https://developer.apple.com/library/content/samplecode/AudioInARKit/Introduction/Intro.html#//apple_ref/doc/uid/TP40017668-Intro-DontLinkElementID_2

Git

This is more or less a well-known feature of Xcode 9 (as it was announced at WWDC 2017): You can automatically clone repositories from GitHub by clicking on “Open in Xcode” in a repository that contains an Xcode project or workspace:

OpenInXcode

This feature is also supported by a URL scheme:

xcode://clone?repo=<URL_encoded_repository>

This will launch Xcode, clone the given repository, and open its main project or workspace, all in one step.
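As a sketch of how such a URL could be assembled, the helper below percent-encodes a repository URL and appends it to the scheme. The function name is hypothetical, the repository is only an example, and encoding every character outside RFC 3986’s unreserved set is my assumption about what <URL_encoded_repository> requires.

```swift
import Foundation

// Builds an xcode://clone URL string for a repository URL.
// Everything outside the unreserved set (letters, digits, "-._~")
// is percent-encoded; this level of encoding is an assumption.
func xcodeCloneURL(forRepository repository: String) -> String? {
    let unreserved = CharacterSet(charactersIn:
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~")
    guard let encoded = repository.addingPercentEncoding(withAllowedCharacters: unreserved) else {
        return nil
    }
    return "xcode://clone?repo=" + encoded
}

print(xcodeCloneURL(forRepository: "https://github.com/apple/swift.git") ?? "")
```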

Hopefully, Apple will support git branches in this URL scheme in a future version of Xcode. The perfect use case is opening a code review in your IDE, directly from GitHub.

Devices and Simulators Management

I happen to open the Devices and Simulators pane in Xcode quite frequently (accessible from Window, Devices and Simulators), typically to open crash logs. This is the URL scheme that I use to automate this step and manage my devices more quickly:

xcdevice://showDevicesWindow

This URL scheme also accepts a parameter to automatically enable a particular device for development:

xcdevice://enableForDevelopment?identifier=<Device_Identifier>

Quick Access to Preferences

If I need to quickly open a particular tab in Xcode preferences (say, Key Bindings or Components), I also use a URL scheme:

xcpref://GeneralPrefs (General tab)
xcpref://AccountsPrefs (Accounts tab)
xcpref://AlertPrefs (Behaviors tab)
xcpref://KeyBindingsPrefs (Key Bindings tab)
xcpref://FontAndColorPrefs (Fonts & Colors tab)
xcpref://NavigationPrefs (Navigation tab)
xcpref://LocationsPrefs (Locations tab)

And that’s it. I hope you like these tips and that they streamline your Xcode workflow a little bit. Enjoy!


Having fun visualizing Swift’s sort()

Peter Naur, a renowned Danish computer scientist, said that programming is not merely the creation of a program, but an activity by which the programmers form a theory of the matters at hand. If you haven’t read his famous paper, I encourage you to reserve some time to do that. In my opinion, complexity prevents us from reaching a complete theory of a particular program, and many software engineering communities are exploring ways to avoid or reduce it (for example, by applying functional programming techniques to app development).

I’ve been recently exploring the domain of program visualization, that is, software on top of a debugger, or even isolated programs, whose only purpose is to help the programmer understand their code better. I think that better tools in this area would help programmers form a better theory of a complex program, and may reduce the number of bugs that are introduced and the time we spend debugging. To give you a more concrete example, I think that Apple’s recent memory graph debugger is a very neat abstraction that helps immensely when you are trying to understand the resource ownership graph of your program. It beats raw source code exploration and documentation by any measure.

The other day, I decided to have some fun with Swift arrays. Swift arrays have standard sorting routines to sort them, either in-place or by returning a new instance. Imagine that you want to understand which sorting algorithm is implemented by the standard library, and what steps are followed for a particular array. What alternatives do you have?

  • As the standard library is open source, you could simply read the implementation and try to understand the algorithms by yourself.
  • You could use sort(by:) and print the elements involved in each comparison.

The second bullet is what basic debugging is about, and I don’t know about you but I’ll probably need to have a pen and paper close by to draw the array, the elements, and understand how the sorting is happening, just like I did when I was studying sorting algorithms at university.

Can we do better? Any code that is heavily algorithmic certainly benefits from visualization. That’s why websites like https://visualgo.net are so popular among students and interested professionals. In order to show the contents of the array at each step, we may start searching GitHub for a good Swift image library. But there’s a poor man’s technique that can create images and even animated GIFs using plain print statements in Swift if we follow the PPM format (the trade-off is that the image will be much bigger than a typical JPG or PNG). With tools like ImageMagick, you can even create an animated GIF from a sequence of PPM images.
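To show how little is needed, here is a minimal sketch that renders an image as a PPM string. It uses the plain-text “P3” variant of the format because it is easy to read; the program later in this post writes the binary “P6” variant instead. The function name and the pixel closure are made up for this illustration.

```swift
// Renders an image as plain-text PPM ("P3"): a header with the magic
// number, the size, and the maximum intensity, followed by one
// "r g b" triple per pixel, row by row.
func ppm(width: Int, height: Int,
         pixel: (Int, Int) -> (r: Int, g: Int, b: Int)) -> String {
    var out = "P3\n\(width) \(height)\n255\n"
    for y in 0..<height {
        for x in 0..<width {
            let p = pixel(x, y)
            out += "\(p.r) \(p.g) \(p.b)\n"
        }
    }
    return out
}

// A 2x1 image: one red pixel followed by one black pixel.
let image = ppm(width: 2, height: 1) { x, _ in
    x == 0 ? (r: 255, g: 0, b: 0) : (r: 0, g: 0, b: 0)
}
print(image, terminator: "")
```

Redirecting the output of such a program to a file with the .ppm extension gives an image that common viewers and ImageMagick can open.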

A very simple Swift program that sorts a random array of 64 elements is this:


import Foundation
var sample = [11, 57, 55, 37, 54, 41, 30, 8, 53, 4, 47, 58, 18, 56, 17, 12, 39, 28, 16, 63, 40, 27, 50, 48, 19, 2, 25, 52, 13, 59, 64, 9, 26, 24, 23, 44, 21, 0, 6, 62, 61, 7, 29, 43, 38, 33, 51, 34, 3, 42, 22, 46, 5, 1, 10, 32, 60, 15, 49, 45, 35, 20, 36, 31]
// sort(by:) mutates the array in place, so sample must be declared with var.
sample.sort {
    return $0 < $1
}


The sorting closure is invoked with each pair of elements that the algorithm needs to compare, so we could expand it to access the partially sorted array and create a PPM image that depicts its current state. The problem we will face is that the sample array is being modified in place, so any simultaneous access to it is going to be prevented by the part of Swift’s runtime that enforces exclusive access to memory (read the memory ownership manifesto for more information). One way to avoid this is to use the Swift 3 compatibility mode, which downgrades this error to a warning; another is to use a raw UnsafeMutableBufferPointer:


import Foundation

var sample = [11, 57, 55, 37, 54, 41, 30, 8, 53, 4, 47, 58, 18, 56, 17, 12, 39, 28, 16, 63, 40, 27, 50, 48, 19, 2, 25, 52, 13, 59, 64, 9, 26, 24, 23, 44, 21, 0, 6, 62, 61, 7, 29, 43, 38, 33, 51, 34, 3, 42, 22, 46, 5, 1, 10, 32, 60, 15, 49, 45, 35, 20, 36, 31]

let elementWidth = 10    // width of each array element, in pixels
let elementHeight = 100  // height of the frame, in pixels
let markerHeight = 10    // height of the red marker bands, in pixels

sample.withUnsafeMutableBufferPointer { ptr -> () in
    ptr.sort {
        vprintf("P6\n%d %d\n255\n", getVaList([elementWidth*ptr.count, elementHeight]))  // PPM header, one frame per comparison
        for y in 0..<elementHeight {
            let edge = y < markerHeight || y > elementHeight - markerHeight  // top or bottom band
            for x in 0..<elementWidth*ptr.count {
                let elemIndex = x / elementWidth
                let isSelected = elemIndex == ptr.index(of: $0) || elemIndex == ptr.index(of: $1)
                var r = 0, g = 0, b = 0
                if edge && isSelected {
                    r = 255  // red marker for the elements being compared
                } else {
                    r = (ptr[elemIndex] * 255) / ptr.count  // gray level proportional to the value
                    g = r
                    b = r
                }
                vprintf("%c%c%c", getVaList([r, g, b]))
            }
        }
        return $0 < $1
    }
}

The program works as follows: Line 11 prints the required header for a PPM image in binary format, including the frame width, height, and “255”, which is the maximum intensity value.

The nested loops on lines 12 to 27 (rows) and 14 to 26 (columns) visit every pixel of the frame and print it in one of two ways:

  • If the pixel lies on the top or bottom edge of an element that is currently selected by the algorithm, print it in red.
  • If not, print it in a shade of gray whose intensity depends on the element’s value (so that we can see how the array is being sorted).

Using ImageMagick to assemble a GIF from the sequence of images generated by the above program, the result is this:

sort

Red squares represent the elements that are selected at each stage of the algorithm, and the different shades of gray represent the current value stored in each slot. The algorithm starts as quicksort, a well-known sorting algorithm whose worst-case time complexity is quadratic in the size of the array. To avoid this bad worst case, when the depth of the recursion tree is greater than a certain threshold the algorithm switches to heapsort, whose time complexity is O(n log n) in the worst case. By inspecting the source code we can see that the threshold value is 2*floor(log2(N)). Finally, when subarrays are small enough, the algorithm switches to insertion sort, which is very efficient for short arrays.

Swift’s sort is thus a hybrid algorithm that combines several well-known algorithms to avoid bad worst-case performance. The official name of this hybrid algorithm is introsort (invented in 1997), and the libc++ team is implementing the same algorithm for Clang. The number of frames in the GIF animation, 407, is close to n*log2(n) = 64*6 = 384, which suggests that the number of comparisons needed to sort this random array of 64 elements grows as O(n log n), which is asymptotically optimal for general sorting algorithms that compare elements. However, there are instances that trigger bad performance from Swift’s sorting algorithm. Can you think of one? Do you want to play evil and use this technique to show what happens in those worst cases? 🙂
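If you want to check the comparison count without generating any images, the following sketch counts how many times the sorting closure is invoked and prints the two reference values discussed above. The variable names are mine, and the number of comparisons depends on the version of the standard library you run it with, so it need not be 407.

```swift
import Foundation

let sample = [11, 57, 55, 37, 54, 41, 30, 8, 53, 4, 47, 58, 18, 56, 17, 12, 39, 28, 16, 63, 40, 27, 50, 48, 19, 2, 25, 52, 13, 59, 64, 9, 26, 24, 23, 44, 21, 0, 6, 62, 61, 7, 29, 43, 38, 33, 51, 34, 3, 42, 22, 46, 5, 1, 10, 32, 60, 15, 49, 45, 35, 20, 36, 31]

// Count how many times the standard library invokes the comparison closure.
var comparisons = 0
let sortedSample = sample.sorted { lhs, rhs in
    comparisons += 1
    return lhs < rhs
}

let n = Double(sample.count)
let nLogN = n * log2(n)                   // 64 * 6 = 384
let depthLimit = 2 * Int(floor(log2(n)))  // 2 * 6 = 12

print("comparisons:", comparisons)
print("n * log2(n):", nLogN)
print("depth threshold for n = 64:", depthLimit)
```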


Reverse Engineering macOS High Sierra Supplemental Update

Matheus Mariano, a Brazilian software developer, reported a programming error in Apple’s most recent operating system, High Sierra, that exposed the passwords of encrypted volumes as password hints. It was a serious bug that quickly made the headlines on technology websites everywhere.


The dreaded password hint bug: Here, “dontdisplaythis” is the actual password.

 

Apple promptly provided the macOS High Sierra Supplemental Update to customers via the App Store, and ensured that every distribution of High Sierra on their servers included this update.

I decided to apply a binary diffing technique to the update to learn more about the root cause of this bug and hypothesize about how the defect could have been prevented.

Inspecting the 51MB package, we can see that there are changes in the Disk Utility and Keychain Access apps, and also in related frameworks and command line tools:

Screen Shot 2017-10-08 at 11.53.25 AM

This post will focus only on the password hint bug, so our first step is to extract Applications/Utilities/Disk Utility.app/Contents/MacOS/Disk Utility and to compare it with the same binary from a stock macOS 10.13 High Sierra. For this, I have written an Emacs extension that launches IDA whenever I load a Mach-O file in a buffer, generates a SQL database with information about the decompiled functions, loads the patched binary, and finally outputs a diff generated by Diaphora. This technique is useful for deconstructing binaries that have been updated by a minor patch release because there are usually just a few changes and common heuristics work well.

The diff between both versions of the Disk Utility binary revealed no differences in the decompilation:

VirtualBox_Windows 10_08_10_2017_13_20_16

That usually means that the only substantial changes reside in one of the linked frameworks. The most interesting one for this investigation is StorageKit, a private Apple framework that exposes APFS functionality to Disk Utility. It has two parts: a client library and a daemon, storagekitd. The client connects to the daemon using an Apple standard XPC mechanism. The daemon executes the operations (represented as subclasses of NSOperation) that the client demands. Here’s an interesting usage of StorageKit inside Disk Utility:


Reference to a StorageKit structure from controller code in Disk Utility.

This is part of the code that runs when you add a new APFS volume from the Disk Utility interface (concretely, the controller responsible for managing the new volume sheet).

Diffing StorageKit provided much more interesting results:

VirtualBox_Windows 10_08_10_2017_14_00_13

[SKHelperClient addChildVolumeToAPFSContainer:name:caseSensitive:minSize:maxSize:password:passwordHint:progressBlock:completionBlock:] was one of the functions modified by the supplemental update. Inspecting the differences in decompilation revealed the actual bug:

VirtualBox_Windows 10_08_10_2017_14_15_16

In the picture above, the old, vulnerable StorageKit is diffed against the updated one. Removed lines are depicted in red, added lines in green, and changes in yellow. The above function basically creates an instance of NSMutableDictionary (Cocoa’s representation of a hash table) and fills it with information about the volume. This dictionary is passed to addChildVolumeToAPFSContainer:optionsDictionary:handlingProgressForOperationUUID:completionBlock: as the optionsDictionary argument.

The most interesting keys in the dictionary are kSKAPFSDiskPasswordOption and kSKAPFSDiskPasswordHintOption, which are responsible for storing the password and the password hint, respectively. The bug is that the same variable, which contains the password (represented in the decompilation as the same virtual register, v50), was used as the value for both keys in the dictionary, meaning that the clear password was incorrectly sent as a password hint via XPC. In reconstructed Objective-C code, the bug would be something like this:

NSMutableDictionary *optionsDictionary = [[NSMutableDictionary alloc] init];
[...]
optionsDictionary[kSKAPFSDiskPasswordOption] = password;
// Bug: the password, not the hint, is stored under the hint key.
optionsDictionary[kSKAPFSDiskPasswordHintOption] = password;

Here’s the corrected function from the supplemental update:

Updated

Note that the correct variables for the password and the password hint are set.

This is an example of a common category of bugs where code with a common structure is copied and pasted but the developer forgets to make every required modification and consequently there’s a fatal change in behavior. If you are curious, this blog post shows you more examples of “Last Line Effect” bugs in open source software.

It’s important to emphasize that, although this particular dictionary is not stored anywhere (it’s simply used to pack the information that is sent to storagekitd), the fact that the password was sent incorrectly as password hint meant that storagekitd trusted its client and stored it as clear text, thinking it was a password hint.

Why did the bug not reproduce when using the command line?

This is a common question. Apparently, Disk Utility and command line diskutil use different code paths. StorageKit does not appear as a direct dependency of diskutil, or in the transitive closure of its dependencies. Here’s otool -L output:

/usr/lib/libcsfde.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/libCoreStorage.dylib (compatibility version 1.0.0, current version 1.0.0)
/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1443.14.0)
/System/Library/Frameworks/IOKit.framework/Versions/A/IOKit (compatibility version 1.0.0, current version 275.0.0)
/System/Library/PrivateFrameworks/DiskManagement.framework/Versions/A/DiskManagement (compatibility version 1.0.0, current version 1.0.0)
/System/Library/Frameworks/DiscRecording.framework/Versions/A/DiscRecording (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
/System/Library/Frameworks/DiskArbitration.framework/Versions/A/DiskArbitration (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/libicucore.A.dylib (compatibility version 1.0.0, current version 59.1.0)
/usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.0.0)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1443.13.0)

This duplication in what’s more or less the same functionality, while sometimes justified, certainly increases the opportunity for bugs.

How could this have been prevented?

There are two engineering practices that help with bugs like this (but do not eradicate them completely):

Unit testing

Unit testing is the practice of creating software tests that exercise a single unit in a computer program, where “unit” is typically a class or module. Effective unit testing requires sensing outputs reliably and asserting that they are expected, so side effects from functions complicate unit testing a bit. In this particular bug, the side effect is the communication with the XPC service, so separating the logic that creates the dictionary from the part that communicates with the service would help. When a software design is not easily testable, companies rely excessively on manual testing, which is not a very effective way of testing, given the high number of combinations that is typical in modern software (did the QA engineer test setting a password *and* a password hint? That is easy to forget on a tight deadline).
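A minimal sketch of that separation, written in Swift for brevity: the dictionary is built by a pure function that never talks to the service, so a test can assert on its output directly. The function, the test, and the string values of the keys are hypothetical stand-ins; StorageKit’s real code is private and certainly looks different.

```swift
// Hypothetical string keys standing in for the framework's private constants.
let passwordKey = "kSKAPFSDiskPasswordOption"
let passwordHintKey = "kSKAPFSDiskPasswordHintOption"

// Pure function: builds the options without contacting any service,
// so its output can be inspected by a test.
func makeVolumeOptions(password: String?, passwordHint: String?) -> [String: String] {
    var options: [String: String] = [:]
    if let password = password { options[passwordKey] = password }
    if let passwordHint = passwordHint { options[passwordHintKey] = passwordHint }
    return options
}

// A unit test for the exact combination affected by the bug:
// a password and a password hint set at the same time.
func testHintIsNotThePassword() -> Bool {
    let options = makeVolumeOptions(password: "dontdisplaythis", passwordHint: "my hint")
    return options[passwordKey] == "dontdisplaythis"
        && options[passwordHintKey] == "my hint"
}

print(testHintIsNotThePassword() ? "passed" : "failed")
```

With this split, the code that sends the dictionary over XPC becomes a thin layer, and the copy-and-paste mistake would make the test fail immediately.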

Code review

Code review is the practice of reviewing code before or after it lands on the main development branch in a software project. Code reviews should always be small, so that the reviewer’s attention is focused and they can suggest better improvements and even spot bugs like this. A “last line” bug can easily be overlooked if it’s part of a huge code review.

Conclusion

An unfortunate bug in macOS High Sierra somewhat tarnished its generally well-received debut. From this root-cause analysis we can learn exactly what happened and how good software development practices (including testable design and strict code reviews) can help reduce the chance that this kind of problem happens again in the future.

 


Welcome to Cocoa Engineering!

NSLog(@"Welcome to Cocoa Engineering!")

This is the first post of a new blog that I’m starting about Cocoa engineering. Cocoa is the name of Apple’s object-oriented API for writing OS X and iOS applications, so, as you can imagine, I’ll write here about how to make applications for iPhone and iPad, describing its common patterns, giving API guidance, and some platform trivia.

I hope you enjoy my blog!
