Word Prediction App Part 1b: Morphing the TableView Application to a Word Prediction App (Some ideas)

In the previous application I wrote, I explored the Apple iOS GUI and MVC structure. To take the app further, I decided it would be cool to turn it into a word prediction application.

There are two parts to predicting words: the first is determining which word could complete the sequence, and the second is learning from it. Learning algorithms and training data are a whole other field, but definitely something I look forward to trying in the future. For now the goal is simple: use the character sequence provided in our “filter” to organize our words by best fit. The current plist I have contains approximately 200,000 words. According to some Google searching, there are over a million words to choose from, so the app may be a bit behind in word choices. Nevertheless, it is still a fun task to try!

So, what do we do once we have a character sequence? Here is how I think it could work.

If we have the character sequence “TH”, we want to filter our list down to a sub-list that contains only the words with TH as a prefix.

Here is some sample code:

- (NSMutableArray *)sortWithSquence:(NSString *)sequence
{
    if([sequence isEqualToString:@""]) //if no filter, return the regular list
    {
        return _tableData;
    }

    sequence = [sequence uppercaseString]; //the sequence will be case sensitive and the list is all uppercase
    //remake the filtered list
    [_filteredList removeAllObjects];

    for(NSString *str in _tableData) //loop through entire set each time for now ..
    {
        if([str hasPrefix:sequence]) //create a sub group of words that contain the prefix
        {
            [_filteredList addObject:str];
        }
    }
    return _filteredList;
}

Now the boring thing to do would be to return the top 3 elements in the new filtered list and offer them as options. When searching for “the”, the top 3 results are “the”, “thea”, “theacea”.

I don’t think these are really common options, and since I do not have commonality statistics to sort by, I am going to find only the elements that start with the sequence and are the same length as the sequence. If no words are of the same length, I will add 1 to the length and look again. I will repeat the process until I have filled a mini group of 3 with the shortest matching words. In theory this should help me find words like “then” and “their”.

Now in the following code you will notice I first create a sub-list and then run the little algorithm over that sub-list. Later I will organize my list into sub-arrays based on the first letter, which should greatly reduce the time spent searching through unwanted areas. Ideally there will be a sub-array for each letter of the alphabet, with the words sorted from shortest to longest for ease of prediction.

- (NSMutableArray *)sortWithSquence:(NSString *)sequence
{
    if([sequence isEqualToString:@""]) //if no filter, return the regular list
    {
        return _tableData;
    }

    sequence = [sequence uppercaseString];
    NSUInteger lenCompare = [sequence length]; //start at the sequence length; it grows by one each pass to reach longer words
    NSMutableArray *tempFilteredList = [[NSMutableArray alloc] init];
    //remake the filtered list
    [_filteredList removeAllObjects];

    for(NSString *str in _tableData) //loop through entire set each time for now ..
    {
        if([str hasPrefix:sequence])
        {
            [tempFilteredList addObject:str];
        }
    }

    NSInteger repeatCounter = 0;
    while([_filteredList count] < 3) //keep going till we have 3 elements
    {
        NSUInteger curListSize = [_filteredList count];
        for(NSString *str in tempFilteredList) //loop through the new smaller filtered list
        {
            //ok we have the sequence lets check if its the right length
            if([str length] == lenCompare)
            {
                NSLog(@"Found match: %@", str);
                [_filteredList addObject:str];
                //stop when we have 3 matches
                if([_filteredList count] == 3)
                {
                    return _filteredList;
                }
            }
        }
        lenCompare++; //go up by 1 length
        if(curListSize == [_filteredList count] && repeatCounter > 1)
        {
            //no new matches found with inc compare size, stopping
            NSLog(@"No new matches found for %@", sequence);
            return _filteredList;
        }
        else
        {
            NSLog(@"Not enough matches found, inc lencompare");
        }
        repeatCounter++;

    }
    return _filteredList;
}

I also added a repeat counter to prevent continuously searching for new words if none are found.
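
The per-letter sub-array idea mentioned earlier could be sketched roughly as follows. This is only a minimal sketch, not the app's actual model: `BucketWordsByFirstLetter` and `ShortestMatches` are hypothetical helper names, and the code assumes an all-uppercase, plain ASCII word list like the plist. Because each bucket is pre-sorted from shortest to longest, the first few prefix matches are already the shortest ones, so the length-increment loop is not needed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original ListModel): groups words by
// their first letter and orders each group from shortest to longest.
static NSDictionary *BucketWordsByFirstLetter(NSArray *words)
{
    NSMutableDictionary *buckets = [NSMutableDictionary dictionary];

    for (NSString *word in words) {
        if (word.length == 0) {
            continue; // nothing to key on
        }
        NSString *key = [[word substringToIndex:1] uppercaseString];
        NSMutableArray *bucket = buckets[key];
        if (bucket == nil) {
            bucket = [NSMutableArray array];
            buckets[key] = bucket;
        }
        [bucket addObject:word];
    }

    // Shortest words first; equal lengths fall back to alphabetical order.
    for (NSMutableArray *bucket in buckets.allValues) {
        [bucket sortUsingComparator:^NSComparisonResult(NSString *a, NSString *b) {
            if (a.length != b.length) {
                return a.length < b.length ? NSOrderedAscending : NSOrderedDescending;
            }
            return [a compare:b];
        }];
    }
    return buckets;
}

// Hypothetical helper: returns up to `limit` of the shortest words that start
// with `sequence`, looking only inside the bucket for its first letter.
static NSArray *ShortestMatches(NSDictionary *buckets, NSString *sequence, NSUInteger limit)
{
    NSMutableArray *matches = [NSMutableArray array];
    if (sequence.length == 0 || limit == 0) {
        return matches;
    }
    NSString *prefix = [sequence uppercaseString];
    NSArray *bucket = buckets[[prefix substringToIndex:1]];
    for (NSString *word in bucket) {
        if ([word hasPrefix:prefix]) {
            [matches addObject:word];
            if (matches.count >= limit) {
                break;
            }
        }
    }
    return matches;
}
```

The buckets would be built once when the plist is loaded, and each keystroke would then call `ShortestMatches(buckets, sequence, 3)` against a single bucket instead of walking the entire list.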

Now here is a look at the final product:

Word Prediction

This concludes the changes for now. I will branch the textview into a word prediction repo. To it I will add the initial model word organization for faster sorting time. I will also add a system to build sentences and use the list to help complete the wording.

Note: I will not really target core linguistics here, but I will add thoughts on what I would do if I had a pretty little (big) database.

[“word”][“type of word”][“commonality”][“linking word”][“most-used-in-conjunction-with-reference”]
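
As a rough sketch of how such a record might look in code, the class below mirrors those five fields. Everything here is an assumption for illustration: `WordEntry`, its property names and `CandidatesByCommonality` are hypothetical, and commonality is treated as a plain number where higher means more common.

```objc
#import <Foundation/Foundation.h>

// Hypothetical record type mirroring the five fields of the database row.
@interface WordEntry : NSObject
@property (nonatomic, copy) NSString *word;
@property (nonatomic, copy) NSString *wordType;     // e.g. noun, verb
@property (nonatomic, assign) double commonality;   // assumed: higher = more common
@property (nonatomic, copy) NSString *linkingWord;
@property (nonatomic, copy) NSString *mostUsedWith; // most-used-in-conjunction-with reference
@end

@implementation WordEntry
@end

// Hypothetical helper: keeps the entries whose word starts with `sequence`
// and orders them from most to least common.
static NSArray *CandidatesByCommonality(NSArray *entries, NSString *sequence)
{
    NSString *prefix = [sequence uppercaseString];
    NSPredicate *hasPrefix = [NSPredicate predicateWithBlock:^BOOL(WordEntry *entry, NSDictionary *bindings) {
        return [entry.word hasPrefix:prefix];
    }];
    NSSortDescriptor *mostCommonFirst = [NSSortDescriptor sortDescriptorWithKey:@"commonality"
                                                                      ascending:NO];
    return [[entries filteredArrayUsingPredicate:hasPrefix]
            sortedArrayUsingDescriptors:@[mostCommonFirst]];
}
```

With a commonality column available, the length-based guess could be replaced by taking the first three entries of this result.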

Having data like that would allow a predictor to weigh many factors when determining which word is most likely to finish the sequence. The other key factor is learning. According to a reading by Ray Kurzweil, his approach uses Hierarchical Hidden Markov Models, or HHMMs for short. These learning algorithms allow letters, words and sentences (referred to as nodes) to link to one another, with a probability value attached to each link.

For example:

The character sequence “APP” may form links to “APPLE”, “APPLICATION” and “APPLY”. Each link can be described by a probability value, and the probabilities leaving one node add up to 1.

APP -- 0.5 --> APPLE
    -- 0.3 --> APPLICATION
    -- 0.2 --> APPLY

Using a Markov model we can determine that APPLE should be the first choice, since its link has the highest probability. In essence we are looking at the transition from one state to another. This is the simplest version of a Markov model, known as a Markov chain.
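
A minimal sketch of that lookup, assuming the links are stored as a dictionary of dictionaries: the function names `SampleChain`, `RankedTransitions` and `BestNextWord` are hypothetical, and the probabilities are illustrative values chosen to sum to 1 rather than learned data.

```objc
#import <Foundation/Foundation.h>

// Illustrative chain: each state maps to its outgoing links, and each link
// maps a target word to a made-up probability.
static NSDictionary *SampleChain(void)
{
    return @{ @"APP": @{ @"APPLE": @0.5, @"APPLICATION": @0.3, @"APPLY": @0.2 } };
}

// Hypothetical helper: given the outgoing links of one state, return the
// target words ordered from most to least probable.
static NSArray *RankedTransitions(NSDictionary *links)
{
    return [links keysSortedByValueUsingComparator:^NSComparisonResult(NSNumber *a, NSNumber *b) {
        return [b compare:a]; // reversed comparison gives descending order
    }];
}

// Hypothetical helper: the single most likely next word for a state, or nil
// when the chain has no links for that state.
static NSString *BestNextWord(NSDictionary *chain, NSString *state)
{
    return [RankedTransitions(chain[state]) firstObject];
}
```

Here `BestNextWord(SampleChain(), @"APP")` picks APPLE. A learning step would adjust the stored probabilities over time, which this sketch does not attempt.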

Here is a nice video explaining .

Word Prediction App Part 1: IOS TableView Filter Application

As one of my first apps while learning to program for iOS with Objective-C, I decided to take a look at list filtering. Having learned Android first, I was interested in how iOS would handle the entire design process. To keep things simple, the application is a single-column, single-section, multi-row tableview. Most other frameworks would call this a listview, but on iOS it is a UITableView.

Using Apple’s MVC design scheme, this application includes a model that will handle list filtering and creating the initial data from a plist. The model will also provide a pointer to a filtered list which the controller will use to update the view.

All code can be found here: TableView Filter Application BitBucket


Let’s start out with the layout, since Apple makes designing first so easy and fun. Here is the layout I chose for the application:

Storyboard Capture

We place a text field as our filter at the top of the application, a label to display the number of matches we find, and finally the tableview as the body of the application.

In our controller class we need to tie our UI to the controller's outlets and actions. The event to connect the text field's IBAction to is Editing Changed (UIControlEventEditingChanged), which fires on every edit to the text and so simulates the “key up” event most people are used to. A UITextField does not send Value Changed as you type.


@interface ListFilterViewController ()

//our outlets for the UI objects
@property (nonatomic, readwrite, weak)IBOutlet UITextField *tvFilter;
@property (strong, nonatomic)IBOutlet UITableView *tableView;
@property (nonatomic, readwrite, weak)IBOutlet UILabel *matchLbl;

//our model objects
@property (strong, nonatomic)ListModel *listModel;
@property (strong, nonatomic)NSMutableArray *filterdList;

//actions from filter
- (IBAction)filterChanged:(UITextField *)sender;

@end

In order to connect the tableview, the controller needs to adopt the following protocols:

@interface ListFilterViewController : UIViewController <UITableViewDelegate, UITableViewDataSource>
//in more complex apps, each should have its own controller 
@end

This lets us implement the following 3 methods in our controller. With the placeholder versions below, a simple list should be visible with 5 rows of “Cell Text”.

//table view

- (NSInteger)numberOfSectionsInTableView:(UITableView *)tableView
{
    //return the sections we need, just 1 for this example since we want a single list
    return 1;
}

- (NSInteger)tableView:(UITableView *)tableView numberOfRowsInSection:(NSInteger)section
{
    //return the number of rows in section, we are just going to put our array size here
    return 5;
}

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath
{
    //here we create a simple identifier for reusability 
    static NSString *cellID = @"SimpleID";
    
    UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:cellID];
    
    if (cell == nil) { //check if its nil, if it is we need to create a simple cell with the default style
        cell = [[UITableViewCell alloc] initWithStyle:UITableViewCellStyleDefault reuseIdentifier:cellID];
    }
     
    cell.textLabel.text = @"Cell Text"; //set the inner cell label text (NSString literals need the @ prefix)

    return cell; //finally return the individual cell 
}

For the model I created a simple ListModel class that creates an NSMutableArray and initializes with a plist containing approximately 140,000 words.

Here is the .h file

#import <Foundation/Foundation.h>

@interface ListModel : NSObject

@property (nonatomic, readwrite, strong) NSMutableArray *tableData;
@property (nonatomic, readwrite, strong) NSMutableArray *filteredList;

- (NSMutableArray *)sortWithSquence:(NSString *)sequence;

@end

And here is the .m file


#import "ListModel.h"

@implementation ListModel

@synthesize tableData = _tableData;
@synthesize filteredList = _filteredList;

- (id) init
{
    if((self = [super init])) //assign the result of super's init, do not compare against it
    {
        NSLog(@"Creating table data");
        
        // Path to the plist (in the application bundle)
        NSString *path = [[NSBundle mainBundle] pathForResource:
                          @"words" ofType:@"plist"];
        // Build the array from the plist
        _tableData  = [[NSMutableArray alloc] initWithContentsOfFile:path];
        //create a 2nd array that will contain the filtered version 
        _filteredList = [[NSMutableArray alloc] init];
        NSLog(@"Created %lu rows", (unsigned long)_tableData.count);
    }
    return self;
}

- (NSMutableArray *)sortWithSquence:(NSString *)sequence
{
    if([sequence isEqualToString:@""]) //if no filter, return the regular list
    {
        return _tableData;
    }
    
    //remake the filtered list
    [_filteredList removeAllObjects];
    
    //O(n)
    //for simplicity purpose, the app first used brute force methods for the array
    //future renditions of this code will utilize core data for data sorting as core data stores the data
    //in memory for quickness and will provide a wrapper to encapsulate the need for arrays
    for(NSString *str in _tableData) //loop through entire set each time for now ..
    {
        NSRange textRange;
        textRange =[str rangeOfString:sequence options:NSCaseInsensitiveSearch];
        
        if(textRange.location != NSNotFound)
        {
            [_filteredList addObject:str];
        }
    }
    return _filteredList;
}

@end

For the filter method, we first check if the sequence is empty and, if it is, return the pointer to the NSMutableArray that contains all the words. This saves us from rebuilding a list that would just be the full list again.
Next the filtered list calls removeAllObjects to clear itself without deallocating the array. The array releases its strong references to the words it held. The words themselves stay alive because the full list still holds them, so under ARC only memory that nothing references any more can be reclaimed. To verify that memory was not piling up, I ran through each letter to re-create 26 filtered lists and watched the memory inspector provided by Xcode.

Memory Utilization

We can see that the app has a fairly high initial memory usage, starting at around 32 MB. Creating and re-creating the filtered lists brings it up to a peak of about 36 MB, and it drops back down after a short period of time. Using tools like this while coding is very important for catching memory leaks and poor memory usage.
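
As an aside on the filter method itself, the same case-insensitive “contains” check can also be written with NSPredicate. This is a sketch of an alternative rather than the code used in the app: `WordsContaining` is a hypothetical function, and it returns a new array on every call instead of reusing the model's filtered list, so its memory profile may differ from the measurements above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical alternative to the manual loop: returns the words that contain
// `sequence` anywhere, ignoring case. An empty sequence returns the full list.
static NSArray *WordsContaining(NSArray *words, NSString *sequence)
{
    if (sequence.length == 0) {
        return words;
    }
    // [c] makes CONTAINS case-insensitive, like NSCaseInsensitiveSearch.
    NSPredicate *contains = [NSPredicate predicateWithFormat:@"SELF CONTAINS[c] %@", sequence];
    return [words filteredArrayUsingPredicate:contains];
}
```

The loop version and this one both walk the whole list, so the O(n) cost per keystroke stays the same; the predicate only shortens the code.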


The ListModel is allocated and initialized within the controller class, along with the table data source and delegate, in viewDidLoad. Notice I also initialize the filtered list here and set the default match label text.


- (void)viewDidLoad
{
    [super viewDidLoad];
    NSLog(@"View did load, init of data");
    _listModel = [[ListModel alloc] init];
    _filterdList = _listModel.tableData; //init to data source at first
    _matchLbl.text = [NSString stringWithFormat:@"Matches: %lu", (unsigned long)_filterdList.count];
    _tableView.dataSource = self;
    _tableView.delegate = self;
    
}

Next we can set up the filter event to re-filter our array every time the text field's value changes. This is also where the label can be updated.

- (IBAction)filterChanged:(UITextField *)sender
{
    _filterdList = [_listModel sortWithSquence:sender.text]; //get our sorted list
    _matchLbl.text = [NSString stringWithFormat:@"Matches: %lu", (unsigned long)_filterdList.count]; //update our label
    [_tableView reloadData]; //reload the table data
}

Finally we can implement the 3 table methods properly:

- (NSInteger)numberOfSectionsInTableView:(UITableView *)tableView
{
    return 1;
}

- (NSInteger)tableView:(UITableView *)tableView numberOfRowsInSection:(NSInteger)section
{
    return _filterdList.count;
}

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath
{
    //here we create a simple identifier for reusability 
    static NSString *cellID = @"SimpleID";
     
    UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:cellID];
     
    if (cell == nil) { //check if its nil, if it is we need to create a simple cell with the default style
        cell = [[UITableViewCell alloc] initWithStyle:UITableViewCellStyleDefault reuseIdentifier:cellID];
    }
     
    NSString *cellText = [_filterdList objectAtIndex:indexPath.row]; 
    cell.textLabel.text = cellText; //set the inner cell label text

    return cell;
}

The final product will look like this:

App Sample

App Sample Filtered

Not sure what nonpariello is but it has my name in it so it must be something important.