Channel: MD's Technical Sharing

LD3320 Chinese Speech Recognition and MP3 Player Module

After my previous success in getting the SYN6288, a Chinese text-to-speech IC, to produce satisfactory Chinese speech and pronounce English characters, I purchased the LD3320, another Chinese voice module that provides speech recognition as well as MP3 playback capabilities.

The module's Chinese voice recognition mechanism can be initialized with the Pinyin transliterations of the Chinese text to be recognized. The module will then listen to the audio sent to its input channel (either from a microphone or from the line-in input) to identify any voice that resembles the programmed list of Chinese words sent during initialization. Audio during MP3 playback is sent via the headphone/lineout (stereo) and speaker (mono) pins. Data communication with the module is done using either a proprietary parallel protocol or SPI.

The board I purchased comes with a condenser microphone and 2.54mm connection headers for easy prototyping:


Board Schematics

The detailed schematic of the board is shown below:


The connection headers on the breakout board expose several useful pins, namely VDD, GND, parallel/SPI communication lines and audio input/output pins. The detailed pin description can be found below, where ^ denotes an active low signal:

VDD          3.3V Supply
GND          Ground
RST^         Reset Signal
MD           Low for parallel mode, high for serial mode.
INTB^        Interrupt output signal
A0           Address/data select for parallel mode: high means P0-P7 carry an address, low means data.
CLK          Clock input for LD3320 (2-34 MHz).
RDB^         Read control signal for parallel input mode
CSB^/SCS^    Chip select signal (parallel mode) / SPI chip select signal (serial mode).
WRB^/SPIS^   Write Enable (parallel input mode) / Connect to GND in serial mode
P0           Data bit 0 for parallel input mode / SDI pin in serial mode
P1           Data bit 1 for parallel input mode / SDO pin in serial mode
P2           Data bit 2 for parallel input mode / SDCK pin in serial mode
P3           Data bit 3 for parallel input mode
P4           Data bit 4 for parallel input mode
P5           Data bit 5 for parallel input mode
P6           Data bit 6 for parallel input mode
P7           Data bit 7 for parallel input mode
MBS          Microphone Bias
MONO         Mono Line In 
LINL/LINR    Stereo LineIn (Left/Right)
HPOL/HPOR    Headphone Output (Left/Right)
LOUTL/LOUTR  Line Out (Left/Right)
MICP/MICN    Microphone Input (Pos/Neg)
SPOP/SPON    Speaker Output (Pos/Neg)

The LD3320 requires an external clock to be fed to pin CLK, which is already provided by the breakout board via a 22.1184 MHz crystal. No external components are needed, even for the audio input/output lines, as the breakout board already contains all the required parts.

To use SPI for communication, connect MD to VDD, WRB^/SPIS^ to GND and use pins P0, P1 and P2 for SDI, SDO and SDCK respectively. For simplicity, the rest of this article will use SPI to communicate with this module.
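Based on the command bytes used in publicly available LD3320 drivers (0x04 for a register write, 0x05 for a register read - treat these as assumptions to verify against the datasheet), the SPI register access can be sketched as follows, with the SPI and chip-select routines stubbed out for illustration:

```c
#include <stdint.h>
#include <stddef.h>

#define LD_CMD_WRITE 0x04  /* write-register command byte (per common drivers) */
#define LD_CMD_READ  0x05  /* read-register command byte                       */

/* Mock SPI bus for illustration: records traffic instead of driving pins. */
uint8_t spi_log[16];
size_t  spi_len;
uint8_t spi_transfer(uint8_t out) {
    if (spi_len < sizeof spi_log) spi_log[spi_len++] = out;
    return 0x00;               /* a real bus returns the byte clocked in */
}
void cs_low(void)  {}          /* drive SCS^ low  */
void cs_high(void) {}          /* drive SCS^ high */

/* Write one LD3320 register over SPI: command, address, value. */
void ld_write_reg(uint8_t addr, uint8_t val) {
    cs_low();
    spi_transfer(LD_CMD_WRITE);
    spi_transfer(addr);
    spi_transfer(val);
    cs_high();
}

/* Read one LD3320 register: command, address, then clock the value out. */
uint8_t ld_read_reg(uint8_t addr) {
    cs_low();
    spi_transfer(LD_CMD_READ);
    spi_transfer(addr);
    uint8_t val = spi_transfer(0x00);
    cs_high();
    return val;
}
```

On a real MCU, replace spi_transfer(), cs_low() and cs_high() with your hardware's SPI exchange and SCS^ GPIO routines.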

Official documentation (in Chinese only) can be found on icroute's website. The Chinese datasheet can be downloaded here. With the help of onlinedoctranslator, I made an English translation, which can be downloaded here.

Breakout board issues

Before you proceed to explore the LD3320, be aware of possible PCB issues that feed wrong signals to the IC and waste precious debugging time. In my case, after getting the sample program to compile and run on my PIC microcontroller only to find that it did not work, I spent almost a day checking various connections and initialization code to no avail. I could have debugged until the end of time without success had I not noticed, by chance, a 22.1184 MHz sine wave on the pin marked WRB, raising the suspicion that the PCB traces might have issues.

I decided to use a multimeter and cross-check the connections between the labelled pins on the connection headers and the actual pins on the IC while referring to the LD3320 pin configuration described in the datasheet:


This is the pin description printed on the connection header at the back of the board:


To my surprise, apart from the GND/VDD pins, which are fortunately labelled correctly (otherwise I could have damaged the module by applying power in reverse polarity), the rest of the pin labels on the left and right columns of the left connection header are swapped! For example, RSTB should be INTB, and CLK should be WRB and vice versa - which explains the clock signal I observed on the pin marked WRB. The correct labelling for these pins should be:


For the right and bottom connection headers, the labelling is correct. However, further tests showed that the condenser microphone is connected in reverse polarity and that there are several other connection issues between the microphone and the LD3320. The connections on the PCB did not seem to match the board schematic, which could indicate either a faulty PCB or a mismatched schematic. Either way, the microphone input would not work even with the ECM replaced, and I could only get the module to work using the line-in input (more on that later) after removing the ECM from the board. The presence of the microphone, even if unused, disturbs the line-in input channel and prevents the module from working.

Therefore, before you apply power to the board, check that the pin labelling is correct - or at least verify the VDD and GND pins. Also note that your board may have no issues at all, or different issues from those described above.

Speech recognition

The few examples I found for this IC are coocox's LD3320 driver and some 8051 code downloadable from here. By comparing the code with the initialization protocol provided in the datasheet, the steps to use this module can be summarized as follows:

1. Reset the module by pulling the RST pin low, and then high for a short while.
2. Initialize the module for ASR (Automatic Speech Recognition) mode. In particular, set the input channel to be used for speech recognition. 
3. Initialize the list of Chinese words to be recognized. For each Chinese word, send the Pinyin transliteration of the word (without tone marks) in ASCII (e.g. bei jing for 北京) and an associated code (a number between 1 and 255) to identify the word. The codes need not be consecutive, and multiple words can share the same identification code.
4. Look for an interrupt on the INTB pin, which will trigger when a voice has been detected on the input channel.
5. When the interrupt happens, instruct the LD3320 to perform speech recognition, which will analyse the detected voice for any patterns similar to the list of Chinese words programmed in step 3. If a match is found, the chip will return the identification code associated with the word.
6. After a speech recognition task is completed, go back to step 1 to be ready for another recognition task.
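Step 3 can be modeled as a simple table of Pinyin strings and identification codes. The ld_send_word() helper below is a hypothetical placeholder for the per-word register writes described in the datasheet; only the table layout and the validation rules are taken from the description above:

```c
#include <stdint.h>
#include <stddef.h>
#include <ctype.h>

/* One recognition entry: toneless Pinyin in ASCII plus an ID code (1-255).
 * Codes need not be consecutive, and two entries may share a code. */
typedef struct {
    uint8_t     code;
    const char *pinyin;
} asr_word_t;

const asr_word_t g_words[] = {
    { 1, "bei jing" },      /* 北京 */
    { 2, "shang hai" },     /* 上海 */
    { 3, "a li ba ba" },
};

/* Sanity-check an entry before sending it to the chip: code at least 1
 * (uint8_t caps the upper bound), Pinyin limited to lower-case letters
 * and spaces. */
int asr_word_valid(const asr_word_t *w) {
    if (w->code < 1) return 0;
    if (!w->pinyin || !w->pinyin[0]) return 0;
    for (const char *p = w->pinyin; *p; ++p)
        if (!islower((unsigned char)*p) && *p != ' ')
            return 0;
    return 1;
}

/* Placeholder: in a real driver this performs the per-word register
 * writes from the datasheet; here it only counts calls. */
int g_words_sent;
void ld_send_word(uint8_t code, const char *pinyin) {
    (void)code; (void)pinyin;
    ++g_words_sent;
}

/* Load the whole word list during ASR initialization (step 3). */
void asr_load_words(void) {
    for (size_t i = 0; i < sizeof g_words / sizeof g_words[0]; ++i)
        if (asr_word_valid(&g_words[i]))
            ld_send_word(g_words[i].code, g_words[i].pinyin);
}
```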

To specify which input channel will be used for speech recognition, use register 0x1C (ADC Switch Control). Write 0x0B for microphone input (MICP/MICN pins), 0x07 for stereo input (LINL/LINR pins) and 0x23 for mono input (MONO pin).

In my tests, as the microphone input channel cannot be used due to the PCB issues mentioned above, I used the stereo input channels with an ECM and a preamplifier circuit based on a single NPN transistor. The output of this circuit is then connected to the LINL/LINR audio input pins of the LD3320. Below is the diagram of the preamplifier:


To achieve the highest recognition quality possible, several registers of the LD3320 are used to adjust the sensitivity and selectivity of the recognition process:
  • Register 0x32 (ADC Gain) can be set to values between 0x00 and 0x7F. The greater the value, the greater the input audio gain and the more sensitive the recognition. However, higher values may result in more noise and misidentifications. Set to 0x10-0x2F for a noisy environment; otherwise, use a value between 0x40 and 0x55.
  • Register 0xB3 (ASR Voice Activity Detection). If set to 0 (disabled), all sounds detected on the input channel are treated as voice and trigger the INTB interrupt. Otherwise, INTB is triggered only when actual voice is detected on the audio input channel, while static noise is ignored. Set to a value between 1 and 80 to control the sensitivity of this detection - the lower the value, the higher the sensitivity. In general, the higher the signal-to-noise ratio (SNR) of the working environment, the higher the recommended value of this register. Default is 0x12.
  • Register 0xB4 (ASR VAD Start) defines how long continuous speech must be detected before it is recognized as voice. Set to a value between 1 and 80 (10 to 800 milliseconds). Default is 0x0F (150 ms).
  • Register 0xB5 (ASR VAD Silence End) defines how long a silence period must be detected at the end of a speech segment before the speech is considered to have ended. Set to 20-200 (200-2000 ms). Default is 60 (600 ms).
  • Register 0xB6 (ASR VAD Voice Max Length) defines the longest possible duration of a detected speech segment. Set to 5-200 (500 ms to 20 sec). Default is 60 (6 seconds).
After initializing the LD3320 according to the datasheet and tweaking the speech recognition setup registers, I could get the LD3320 to recognize Chinese proper names such as bei jing (北京) and other words like a li ba ba. The quality of the recognition is satisfactory.
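Putting the input-channel selection and the tuning registers together, an initialization sketch might look like this. The register addresses and value ranges come from the text above; ld_write_reg() is stubbed so the values can be inspected, and the chosen settings are just one plausible quiet-room starting point:

```c
#include <stdint.h>

/* Register map gathered from the text above. */
#define REG_ADC_SWITCH   0x1C  /* 0x0B mic, 0x07 stereo line-in, 0x23 mono */
#define REG_ADC_GAIN     0x32  /* 0x00-0x7F, higher = more sensitive       */
#define REG_VAD_SENS     0xB3  /* 0 = off; 1-80, lower = more sensitive    */
#define REG_VAD_START    0xB4  /* units of 10 ms                           */
#define REG_VAD_SIL_END  0xB5  /* units of 10 ms                           */
#define REG_VAD_MAX_LEN  0xB6  /* units of 100 ms                          */

/* Stub writer for illustration: remembers the last value per register. */
uint8_t reg_shadow[256];
void ld_write_reg(uint8_t addr, uint8_t val) { reg_shadow[addr] = val; }

/* Convert the VAD timings from milliseconds to register units. */
uint8_t vad_10ms(uint16_t ms)  { return (uint8_t)(ms / 10); }
uint8_t vad_100ms(uint16_t ms) { return (uint8_t)(ms / 100); }

/* Apply one plausible quiet-room tuning; adjust per environment. */
void asr_tune(void) {
    ld_write_reg(REG_ADC_SWITCH,  0x07);            /* stereo LINL/LINR  */
    ld_write_reg(REG_ADC_GAIN,    0x40);            /* low end of 0x40-0x55 */
    ld_write_reg(REG_VAD_SENS,    0x12);            /* datasheet default */
    ld_write_reg(REG_VAD_START,   vad_10ms(150));   /* 150 ms -> 0x0F    */
    ld_write_reg(REG_VAD_SIL_END, vad_10ms(600));   /* 600 ms -> 60      */
    ld_write_reg(REG_VAD_MAX_LEN, vad_100ms(6000)); /* 6 s    -> 60      */
}
```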
    MP3 playback

    The LD3320 also supports playback of MP3 data received via SPI. Playback is done using the following steps: 

    1. Reset and initialize the LD3320 in MP3 mode.
    2. Set the correct audio output channel for audio playback. 
    3. Send the first segment of the MP3 data to be played.
    4. Check if the MP3 has finished playing. If so, stop playback.
    5. If not, continue to send more MP3 data and go back to step 4.
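The playback steps above can be sketched as a simple pump loop. The FIFO and status hooks below are placeholders - a real driver would poll the LD3320's status registers over SPI instead:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical chip hooks, stubbed here so the flow can be followed.
 * In a real driver these would poll LD3320 status registers and feed
 * its data FIFO over SPI. */
size_t g_fed;                          /* bytes "sent" to the chip */
int  fifo_has_room(void)             { return 1; }
void fifo_write(uint8_t b)           { (void)b; ++g_fed; }
int  playback_finished(size_t total) { return g_fed >= total; }

/* Pump an in-memory MP3 buffer to the chip, mirroring steps 3-5 above. */
void mp3_play(const uint8_t *mp3, size_t len) {
    size_t pos = 0;
    while (!playback_finished(len)) {          /* step 4 */
        while (pos < len && fifo_has_room())   /* steps 3 and 5 */
            fifo_write(mp3[pos++]);
    }
    /* step 4: stop playback here (clear the play bit, mute outputs) */
}
```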

    Three types of audio output are supported: headphone (stereo), line out (stereo), or speaker (mono). The headphone and line out channels are always enabled, whereas the speaker channel must be enabled separately. Line out and headphone output volume can be adjusted by writing a value to bits 5-1 of registers 0x81 and 0x83 respectively, with 0x00 indicating maximum volume. Speaker output volume can be changed by writing to bits 5-2 of register 0x83, with 0x00 indicating maximum volume.
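Since the volume values occupy different bit positions, small helpers avoid shift mistakes. The field widths (5 bits at offset 1 for line out/headphone, 4 bits at offset 2 for speaker) follow the description above; remember that 0x00 is maximum volume:

```c
#include <stdint.h>

/* Pack a volume level into a register bit-field. Per the text above,
 * 0 means maximum volume; larger values attenuate. */
uint8_t vol_bits(uint8_t level, unsigned shift, uint8_t mask) {
    return (uint8_t)((level & mask) << shift);
}

/* Line out / headphone volume lives in bits 5-1 (5-bit field, shift 1);
 * speaker volume in bits 5-2 (4-bit field, shift 2). */
uint8_t lineout_vol(uint8_t level) { return vol_bits(level, 1, 0x1F); }
uint8_t speaker_vol(uint8_t level) { return vol_bits(level, 2, 0x0F); }
```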

    According to the datasheet, the speaker output line can support an 8-ohm speaker. However, in my tests, connecting an 8-ohm speaker to the speaker output will cause the module to stop playback unexpectedly, presumably due to high power consumption, although the sound quality through the speaker remains clear. The headphone and line out channels seem to be stable and deliver good quality audio.

    I also tried to connect a PAM8403 audio amplifier to the line-out channel to achieve a stereo output using two 8-ohm speakers. At first, with the PAM8403 sharing the same power and ground lines with the LD3320, the same issue of unexpected playback termination persisted, even with the usage of decoupling capacitors. Suspecting the issue may be due to disturbance caused by the 8-ohm speaker sharing the same power lines, I used a different power supply for the PAM8403 and the LD3320 managed to play MP3 audio smoothly with no other issues.

    Demo video

    I made a video showing the module working with a PIC microcontroller and an ST7735 128x160 16-bit color LCD to display the speech recognition results. It shows the module trying to recognize proper names in Chinese (bei jing 北京, shang hai 上海, hong kong 香港, chong qing 重庆, tian an men 天安门) and other words such as a li ba ba. A single beep means that the speech is recognized while a double beep indicates unrecognized speech. Although the speech recognition quality depends highly on the input audio, volume level and other environmental conditions, overall the detection sensitivity and selectivity seem satisfactory, as can be seen from the video.

    The end of the video shows the stereo playback of an MP3 song stored on the SD card - using a PAM8403 amplifier whose output is fed into two 8-ohm speakers. Notwithstanding the background noises presumably due to effects of breadboard stray capacitance at high frequency (22.1184 MHz for this module), MP3 playback quality seems reasonably good and comparable to the VS1053 module.



    The entire MPLAB X demo project for this module can be downloaded here.

    See also

    SYN6288 Chinese Speech Synthesis Module
    Interfacing VS1053 audio encoder/decoder module with PIC using SPI 

    English text to speech on a PIC microcontroller

    I have always been a fan of the TTS256 - a tiny but great English text-to-speech IC based on an 8-bit microprocessor for embedded voice applications. Unfortunately, the TTS256 has been out of production for a long time and, despite better technology being developed over the years, chip manufacturers do not seem interested in developing a similar or better text-to-speech IC, leaving the average electronics hobbyist searching eBay for second-hand TTS256 ICs, often listed at unreasonable prices.

    Nowadays, SpeakJet by Sparkfun and RoboVoice by Speechchips are among the few available text-to-speech modules for embedded projects. Both are priced at 20-30 USD and have pinout and interface commands similar to the TTS256. Although these speech modules come in handy, their price range seems a bit high for many projects. Hence, I decided to search for free alternatives.

    Syntho and PICTalker

    There are several open source text-to-speech projects for 8-bit microcontrollers such as Syntho and PICTalker, built for the PIC16F616 and PIC16F628 respectively. In both projects, one or more EEPROMs are used to store the phoneme database. The EEPROM size is around 64K for the PICTalker project and is made smaller by using innovative compression techniques in the Syntho project. Both projects require phonemes (and not English text) to be sent before they can be pronounced. This is due to the lack of a rule database to convert text to phonemes, presumably due to the limited amount of memory available. These solutions are somewhat closer to the SPO256, which requires phonemes as input, rather than the TTS256, which accepts English text.

    If you don't know what phonemes are, read this on Wikipedia. They are simply the phonetic representations of a word's pronunciation. There are approximately 44 phonemes in English to represent both vowels and consonants.

    Below is a voice sample of the Syntho project, which is trying to say "I am a really cheap computer": 



    As expected, the voice sounds too mechanical and can hardly be understood.

    Arduino TTS library

    Next I came across another TTS library made for the Arduino and decided to give it a quick try to test the speech quality. As I do not have an Arduino board available, I ported it to a Visual Studio 2012 C++ application which accepts English text as input and saves the resulting speech as a wave file. The ported code can be downloaded here. If you intend to use this code, take note that it only writes the wave data for the generated speech and omits the wave file header. You will probably need a sound editor such as GoldWave to play and examine the generated file. This is because calculating the exact total duration of the generated speech (required for creating the wave file header) is complicated, and there was no need for me to attempt that since the code is only for testing.
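For reference, prepending the header yourself is straightforward once the total sample count is known - which is exactly the bookkeeping the ported code skips. Below is a minimal sketch for a standard 44-byte mono 8-bit PCM WAV header (it assumes a little-endian host, since WAV fields are little-endian):

```c
#include <stdint.h>
#include <string.h>

/* Build a 44-byte PCM WAV header for mono 8-bit audio at 'rate' Hz.
 * 'nsamples' must be known up front. Multi-byte fields are copied from
 * native integers, so this sketch assumes a little-endian host. */
void wav_header(uint8_t hdr[44], uint32_t nsamples, uint32_t rate) {
    uint32_t data_len = nsamples;            /* 1 byte per sample */
    uint32_t riff_len = 36 + data_len;
    uint16_t channels = 1, bits = 8, pcm = 1;
    uint32_t fmt_len = 16;
    uint32_t byte_rate = rate * channels * bits / 8;
    uint16_t block_align = (uint16_t)(channels * bits / 8);

    memcpy(hdr,      "RIFF", 4);             /* chunk IDs */
    memcpy(hdr + 8,  "WAVEfmt ", 8);
    memcpy(hdr + 36, "data", 4);
    memcpy(hdr + 4,  &riff_len, 4);          /* sizes and format fields */
    memcpy(hdr + 16, &fmt_len, 4);
    memcpy(hdr + 20, &pcm, 2);
    memcpy(hdr + 22, &channels, 2);
    memcpy(hdr + 24, &rate, 4);
    memcpy(hdr + 28, &byte_rate, 4);
    memcpy(hdr + 32, &block_align, 2);
    memcpy(hdr + 34, &bits, 2);
    memcpy(hdr + 40, &data_len, 4);
}
```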

    This is the generated voice sample. It is trying to say "Hello Master, how are you doing? I am fine, thank you.":



    Although the quality is obviously better than the PICTalker's, it still sounds robotic and is difficult to understand.

    SAM (Software Automatic Mouth) project

    My next attempt was to see if the same could be done on a PIC, with better speech quality. By chance, I came across SAM (Software Automatic Mouth), a tiny (less than 39KB) text-to-speech C program. The project website contains a tool to generate a demo voice from the text entered.

    After getting the Windows source code to compile and run without issues in Visual Studio (download the project here), I decided to port the code to the PIC24FJ64GA002, which turned out to be surprisingly straightforward. The only challenges were to port all the 32-bit data types in the original source code properly to the 16-bit architecture of the PIC24 microcontroller, and to fit the rule and phoneme databases nicely into the available memory of the PIC24FJ64GA002. Fortunately, the entire project when compiled uses just around 50% of the total program and data memory on the PIC24FJ64GA002, leaving space for other code.
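The data-type issue boils down to the PIC24's 16-bit int: arithmetic the original code did in plain int silently wraps on a 16-bit machine. A sketch of the approach, using exact-width types from <stdint.h> so each 32-bit assumption is explicit (the typedef names here are illustrative, not from the SAM source):

```c
#include <stdint.h>

/* On the 16-bit PIC24, 'int' is 16 bits wide, so exact-width types make
 * every 32-bit assumption in the ported code explicit. */
typedef int32_t  sample_acc_t;   /* accumulators that must hold 32 bits  */
typedef uint16_t tbl_index_t;    /* indexes into the rule/phoneme tables */

/* Example: summing two 16-bit samples must widen first, or the result
 * wraps in 16-bit 'int' arithmetic. */
sample_acc_t acc_add(int16_t a, int16_t b) {
    return (sample_acc_t)a + (sample_acc_t)b;
}
```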

    You may be able to fit the project into the smaller PIC24FJ32GA002 by changing the project build options to use the large memory model during compilation:


    However, in my experiment, the code compiled but ran erratically when using the large memory model, perhaps due to pointer behavior differences. It is therefore better to compile with the default settings and use the PIC24FJ64GA002 (or one with more memory) to save the trouble and leave more code space for other purposes.

    The following is a recording of the generated speech for the sentence "This is SAM text to speech. I am so small that I will work also on embedded computers", when running on the PIC24 using PWM for audio output.



    Below is a longer demo speech. Can you understand what it is trying to say?



    As can be seen, the quality of the generated speech is much better - less mechanical, clearer pronunciation and easier to understand. Although the voice still sounds robotic and there are some mispronounced words, the overall quality should be good enough to be used in embedded projects, as a free alternative to current commercial text to speech solutions.

    With some pitch adjustments, the PIC24 can also sing "The Star-Spangled Banner", the national anthem of the United States of America:



    The complete ported code, as a MPLAB 8 PIC24FJ64GA002 project, can be downloaded here. The project also contains example codes for the SYN6288, a Chinese text-to-speech module.

    See also

    SYN6288 Chinese Speech Synthesis Module  
    LD3320 Chinese Speech Recognition and MP3 Player Module

    Tektronix 1230 Logic Analyzer

    Made in the 1980s, the Tektronix 1230 is a general purpose logic analyzer that supports a maximum of 64 channels with up to 2048 bytes of memory per channel. Despite being huge and heavy compared to today's tiny and portable equivalents (such as the Saleae USB logic analyzer), the 1230 certainly still has its place nowadays, for example to debug older 8-bit designs such as Z80 systems, or simply as an educational tool in a digital electronics class.

    I got mine from eBay, still in good condition after all these years. The CRT is working well and bright, with no burned-in marks that are typical of old CRTs:


    The device comes with a Centronics parallel port and a DB25 RS232 serial port at the back:


    The parallel port supports printing to certain Epson-compatible printer models manufactured in the 1980s. The DB25 (not DB9 like most serial ports found on modern devices) serial port is for communication with the PC using a proprietary MS-DOS application, which is nowhere to be found nowadays. The pinout of the serial port can be found in the notes page of the serial port settings:


    Probes

    The device has sockets for up to 4 probes, for a maximum of 64 input channels. Tektronix P6444/P6443 probes are supported. Both types are almost identical, the P6444 being active and the P6443 passive. My unit did not come with any probes so I had to purchase a P6444 probe from eBay:



    The probe has the following control pins: EXT, CLK 1, CLK 2, QUAL as well as input pins D0-D15 for channels 0 to 15. The CLK pins are only needed if the logic analyzer is configured to use a synchronous clock, in which case CLK 1/CLK 2 will decide when the logic analyzer begins to capture signal samples. Whether the trigger is done on a rising edge or a falling edge is decided by the CLK 1/CLK 2 DIP switches in the centre of the probe box.

    The QUAL pin is for signal qualification (enabled via the QUAL OFF/QUAL ON DIP switches). Its operation is described in the manual of the Tektronix 1240, a later but similar model:


    I leave it as an exercise for the reader to experiment with the qualifier settings and understand how they actually work after reading this article.

    Main menu

    The unit boots up to the main menu, divided into 3 different categories: Setup, Data and Utility:



    The Utility menu group contains device time and parallel/serial port settings. It also provides options to save the current setup to be restored later. Important settings that control the data acquisition behaviour are found in the Setup and Data menu groups.

    Although the time settings allow years between 1900-2099, the year jumps back to 1914 after a reboot even if 2014 was selected. Some sort of Y2K issue, I believe.

    Pressing the NOTES key on any screen will show the instruction text for that screen. To print a screenshot of the current screen, press the NOTES key twice. Pressing the D key while in the Printer Port menu will print the contents of the currently active memory bank.

    Timebase configuration
     
    The Timebase menu allows you to set the type of timebase for each probe (synchronous/asynchronous), the sampling rate (for asynchronous timebase), and the voltage threshold for low/high signals. The default threshold is 1.4V, which means that any signal above 1.4V is considered logic high. With this setting, the logic analyzer supports both TTL and CMOS signals. In my case, as only a single P6444 16-channel probe is connected, the timebase menu only contains settings for probe A:


    Channel group configuration

    The Channel Groups menu allows you to configure the grouping of different input channels:


    The interface is not user-friendly at all here, but that is typical for a machine of this era, isn't it? The display shows several channel groups (GPA, GPB, GPC, etc.), each supporting binary (BIN), octal (OCT) or hexadecimal (HEX) radix. The channel definition strings span several lines showing which channels of which probes belong to each channel group: the first line is the probe name (A, B, C or D) and the next 2 lines are the channel number (00 to 15). For example, in the above screenshot, channel group GPA is in binary format, uses timebase T1 with positive polarity and contains channels 00 to 15 of probe A.

    Trigger configuration

    The Trigger menu defines the conditions of the input signal which, if met, will cause the logic analyzer to start capturing samples:



    The above display means: if value A occurs once, start capturing the data and fill the sample memory. Moving the cursor to the Condition ("A") field allows you to configure how the value is evaluated:


    This is perhaps the most complicated screen in this logic analyzer. Further information is available in the device's help page for the screen.

    Data acquisition configuration

    The logic analyzer has 4 memory banks, each holding up to 2048 data points. It has two display modes for captured data: timing and state. In timing mode, signal levels (low/high) are displayed. In state mode, values of 0 or 1 as captured, or if configured, their hexadecimal, octal or ASCII equivalents, are displayed. 

    The Run Control menu allows you to configure how the input data will be captured and displayed, such as which memory bank (1-4) to be used for sample storage and the default display mode to be shown after the signal has been captured.


    The Mem Select menu allows you to select the active memory bank currently in use. It also shows a summary of the current timebase settings:


    Timing and state diagram

    After setting the necessary configurations, press the START button to start capturing the input signals. The logic analyzer will proceed to wait for the trigger conditions to be met. To stop waiting, press the STOP button.


    Once the trigger conditions are met, the device will start to capture the signals until its memory is full and show the signal timing diagram (or the state diagram if configured in the Run Control menu):


    You can scroll between the captured samples using the arrow keys, or zoom in or out by pressing F, followed by 4 or 5 to change the resolution. The following shows the timing diagram when zoomed out:


    Below is the state diagram of the captured signal, when viewed in binary mode:


    The radix can be changed to octal or hexadecimal by pressing 2:


    ASCII data capture

    Interestingly, the radix of the state diagram can also be changed to ASCII. To test this, I wrote a PIC program to output all characters of the ASCII string "Hello World" to PORTB of a PIC, with sufficient delay after each character. I then connected the probe channels to the output pins (RB0-RB7) and captured the output data. The following is the result when asynchronous timebase is used for capturing:


    Although characters such as 'o', 'd', 'H', 'r', which apparently come from the original "Hello World" string, can be seen, they are not in order, and some characters appear more than once. This is because the sampling clock is asynchronous and differs from the rate at which the output on PORTB changes, resulting in wrongly sampled data.

    To fix this, I used another pin on the PIC and toggled it whenever the output value on PORTB changed to a different character. I then connected this pin to the CLK pin on the probe and set the timebase to synchronous. After capturing the signal again, this is the ASCII capture with this configuration:
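The PIC side of this setup can be sketched as below. The port and delay routines are stubs standing in for a LATB write and a timer delay on the real chip, and for simplicity the sketch pulses the clock line once per character rather than only on changes:

```c
#include <stdint.h>

/* Stand-ins for the PIC's PORTB latch and the extra clock pin; on the
 * real chip these are single register writes. Here they log for testing. */
uint8_t last_port;
int clock_toggles;
void write_portb(uint8_t v) { last_port = v; }
void toggle_clk(void)       { ++clock_toggles; }
void delay_ms(unsigned ms)  { (void)ms; }

/* Emit each character of the message on PORTB and pulse the probe's CLK
 * line once per character, so the 1230 samples synchronously. */
void emit_ascii(const char *msg) {
    for (const char *p = msg; *p; ++p) {
        write_portb((uint8_t)*p);
        toggle_clk();       /* tells the analyzer: sample now */
        delay_ms(10);       /* hold long enough to be visible */
    }
}
```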



    Here "SP" stands for "space" (ASCII code 32). "Hello world" can now be seen clearly in the output, with characters in order and not repeated.

    Capturing narrow pulses

    Out of curiosity, I decided to test how fast a signal this logic analyzer can capture. This can be done by writing a PIC program to toggle an output pin at a fast rate, and trying to capture that signal. In my tests, the shortest pulse that the logic analyzer can capture is around 80ns:



    This is the corresponding display of the same signal on a Rigol DS1052E oscilloscope:


    With these tests, I guess the highest signal frequency that the 1230 can reliably work with is around 10-15MHz. Faster signals may not be captured properly due to slow sampling rates and lack of available memory.

    Interestingly, although the asynchronous clock rate can be set as fast as 10 ns, only half the usual channel memory is available in this configuration, causing the channel groups and trigger conditions to be automatically modified to exclude channels that are unavailable. Fortunately, the 1230 will prompt you about this if 10 ns is selected:



    Add-on cards

    The 1230 can also act as a digitizing oscilloscope and show the actual signal waveform with an appropriate add-on card. The following is the screen output when such a module is installed:


    With the appropriate add-on cards installed, the 1230 can also disassemble instructions for the Z80/8085/68K processors or decode the RS232 protocol using the Disassembly menu.

    Unfortunately my unit did not come with any add-on cards and none of these cards can be found on eBay nowadays. Therefore, selecting the Disassembly menu will just display an error message saying "Disassembly requires personality module".

    Data printout

    Not surprisingly, getting this logic analyzer to print its screenshots or memory contents is a challenge nowadays, as the only supported printing method is via an Epson-compatible printer through a parallel port, which has disappeared from most desktop computers since the introduction of USB. To work around this, I developed a tool which uses a PIC24 to emulate a parallel port printer and stores the printout on an SD card. The printout can later be converted to a bitmap image (.BMP) using a Windows program.

    This is the completed tool when assembled on a stripboard using a ST7735 LCD to display output messages:

     

    See this article for the full source code and other details about the tool.

    Most of the screenshots from the logic analyzer in this article were captured using the above mentioned tool. Using the same tool, you can also capture the device memory contents by pressing the D key while in the Printer Port menu. The output looks like below:

    Memory  | Range is 0000 to 1023 | Timebase 1 | sync  10 uS

    Loc GPA
    bin

    0000 10001000
    0001 10001000
    0002 01110111
    0003 01110111
    0004 01110111
    0005 01110111
    0006 10001000

    The 1230 prints its screenshots as graphics but prints its memory as text. In text mode, Epson escape codes are used to support simple text formatting (e.g. bold). The Windows software I developed can only convert the graphics output to a BMP file. For the memory printout, you can simply read the output file directly using any text editor - most will remove the escape codes (ASCII code < 32) automatically.
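If you prefer to clean a memory printout programmatically rather than rely on the text editor, a small filter can drop the escape sequences. This is a simplification - it skips ESC plus one command byte, whereas some Epson commands take additional parameter bytes:

```c
#include <stddef.h>

/* Copy printable text from an Epson text-mode printout, dropping control
 * bytes (ASCII < 32) but keeping line breaks and tabs. ESC (0x1B) is
 * assumed to be followed by a single command byte, which is skipped too.
 * Returns the output length; 'out' must hold at least 'len' bytes. */
size_t strip_epson(const char *in, size_t len, char *out) {
    size_t n = 0;
    for (size_t i = 0; i < len; ++i) {
        unsigned char c = (unsigned char)in[i];
        if (c == 0x1B) { ++i; continue; }  /* ESC: skip its command byte too */
        if (c >= 32 || c == '\n' || c == '\r' || c == '\t')
            out[n++] = (char)c;
    }
    return n;
}
```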

    Composite video output

    There is a BNC socket, marked as "Video Out", at the back of the logic analyzer. To test the video output, I salvaged a BNC connector from an old oscilloscope probe and made a BNC to RCA adapter:


    This is the video signal shown on my oscilloscope:


    The signal clearly resembles a monochrome composite PAL signal, albeit with a high peak-to-peak voltage (2.5V). It displays well on my old CRT TV:


    And on my 21" LCD monitor, with the help of a composite-to-VGA converter:


    There are some distortions in the video display, with the bottom and top of the display cut off. This may be due to noise in the video cable or limitations of the video output circuitry.

    Probe teardown

    After testing the overall functionality of the logic analyzer, I decided to perform a teardown of the probe to see its internal components. This is the front and the back of the probe's circuit board:


    Apart from some Tek proprietary components such as TEK 165 2304 01, there are also quite a few 74-series ICs and some MC10H350P PECL to TTL translators. Except for the processing unit in the center of the board, no other ICs are socketed, making repairs difficult in case of issues.

    Other information

    The only useful information I found about this logic analyzer on the Internet is an old brochure, downloadable from here. It contains basic technical specifications of the 1230 and some information on the different types of supported add-on cards.

    The following YouTube videos, probably converted from the original VHS training tapes made by Tektronix, are also useful:
    Tektronix 1230 training (part 1)
    Tektronix 1230 training (part 2)

    See also my previous article on emulating a parallel port printer to capture the print output from this logic analyzer (and other similar equipment):
    Capturing data from a Tektronix 1230 logic analyzer by emulating a parallel port printer

    Fixing 'Search fields undefined' error when generating source code for a Scriptcase grid view

    When using Scriptcase to quickly develop a web portal in PHP for administrators to perform CRUD (create/read/update/delete) operations on more than 20 tables in an existing database, I encountered the following error during source code generation for a Scriptcase grid:



    The error occurred after I made some adjustments to the grid SQL query while switching between various options in the grid settings and tried to regenerate the code. The details of the error (Search fields undefined) were not shown until I clicked on the folder icon to the right of the Status: Error text.

    Suspecting some SQL query issues, I checked the grid settings but the correct SQL query was entered in Grid>SQL menu:


    The Search module of the grid was also enabled inside the Grid Modules settings. [The error message would disappear and the code generation would succeed if the Search module was disabled for the grid - however, no search capability would be available in this case]


    In other words, there seemed to be no problems with the grid.

    So what is the issue? A Google search on the error message returned this thread as the only result containing a hint: "The solution: grid_customers...Left:Search...Fields Positioning...middle:the 'valami' push right !!!". This was unfortunately too vague, or perhaps meant for an older version of Scriptcase. Where exactly is the Fields Positioning option, and what caused the error message in the first place?

    After several more hours of trial and error, I found the solution. Apparently, for every Scriptcase grid, several sets of fields need to be defined: the fields shown in list view mode (from the Grid > Edit Fields menu), in record detail view mode (from the Grid > Details > Setting menu), and in search mode (from the Search > Advanced Search/Quick Search/Dynamic Search > Select Fields menu). Although these fields are usually auto-generated, a quick check revealed that the search field configuration for this grid was indeed empty:


    I added the search fields by pressing the >> button to configure all the existing fields for searching:
    The code generation was now successful:
    So the solution is simply to go to the grid search settings and re-configure the fields to be searched. Another few hours of my development time have just been wasted on a trivial issue ....

    But why would the search field list suddenly become empty for this grid? I guess it is because Scriptcase always tries to re-populate the display/search fields in the grid settings when the SQL query changes. Once errors are detected in the SQL query, the display fields will not be populated and will be filled with some default values, while the search field list will be emptied. If these errors are later corrected, the display fields will be populated again with the correct entries but the search field list still remains empty, causing the error Search fields undefined. This may or may not be a Scriptcase bug, but in any case, the error message is not helpful at all here.

    This is just one of the many scenarios where I wasted my time on understanding certain behaviour of Scriptcase, or trying to locate certain settings. Although I have to agree that Scriptcase has increased my PHP development efficiency by orders of magnitude, the lack of documentation and other usability issues still frustrate me at times.

    I would like to end this post with an announcement for frequent readers of my blog. MD's Technical Sharing is now also known as The Tough Developer's Blog, available at the dedicated domain name toughdev.com. This is in preparation for more exciting changes ahead. Stay tuned for my upcoming articles with more useful tips, tricks and knowledge sharing!

    Error 'Your layout should make use of the available space on tablets' when uploading APK to Google Play

    Recently I received feedback from some customers stating that they could not find my application on Google Play when searching from their Android tablets. The app, however, could be found on Google Play if searched from an Android phone. Interestingly, the APK that was used to upload the same application to Google Play could install and run on the customers' tablets without issues.

    I logged on to my Google Play developer console and immediately noticed an advisory in the screenshot section of the application:

    Your APK does not seem to be designed for tablets

    This was in spite of the fact that I had already uploaded tablet screenshots, taken from another tablet, for the app entry on Google Play. However, it turned out that simply uploading tablet screenshots is not enough, as Google has a set of guidelines, available here, that developers should follow to make their application tablet-ready.

    For those rushing to make their application available to tablet users on Google Play, the bad news is that it is not just a simple tweak in the developer console. You would actually need to modify AndroidManifest.xml to indicate tablet compatibility and re-upload the APK. The good news is that not all 12 criteria listed in Google's Tablet App Quality checklist are actually required for the app to show up on Google Play as a tablet app. In fact, during my testing, only the following are needed, at a minimum:

    • Target Android Versions Properly - by setting correct values for targetSdkVersion and minSdkVersion
    • Declare Hardware Feature Dependencies Properly - by setting appropriate values for uses-feature and uses-permission elements
    • Declare Support for Tablet Screens - by setting correct values for supports-screens element
    • Showcase Your Tablet UI in Google Play - simply by uploading at least two tablet screenshots, one for 7-inch devices and one for 10-inch devices
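In practice, the manifest changes boil down to a handful of AndroidManifest.xml elements. The fragment below is an illustrative sketch only; the package name, API levels and feature names are placeholders that you should adjust to your app's actual requirements:

```xml
<!-- Illustrative AndroidManifest.xml fragment for tablet support -->
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.app">

    <!-- Target Android versions properly -->
    <uses-sdk android:minSdkVersion="14" android:targetSdkVersion="19" />

    <!-- Declare hardware features as optional so that tablets
         lacking them (e.g. telephony) are not filtered out -->
    <uses-feature android:name="android.hardware.telephony"
        android:required="false" />

    <!-- Declare support for tablet screens -->
    <supports-screens
        android:smallScreens="true"
        android:normalScreens="true"
        android:largeScreens="true"
        android:xlargeScreens="true" />

</manifest>
```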

     With the above changes, the error message when uploading my APK now changed to:

    Your layout should make use of the available space on 7-inch tablets 
    Your layout should make use of the available space on 10-inch tablets

    Unfortunately Google Play did not provide much useful information for these errors: 



    A search on Google for these errors returned no conclusive results. Some replies suspected that Google Play analyzes the APK looking for design elements specific to tablets (e.g. layout folders named layout-sw600dp, layout-sw600dp-land, layout-sw720dp, layout-sw720dp-land, etc., or an XML layout targeting large screen widths), while others said that Google Play simply analyzes the uploaded screenshots to see whether they look like a tablet app, rather than a phone app running on a tablet with huge unused white space lying around.

    Well, if it is indeed analyzing the screenshots, is there a way to make it think that my screenshots are tablet-compliant? The answer is, surprisingly, to use the Device Art Generator from Google itself: drag your phone app screenshot into the tool and select the Nexus 9, which has a tablet resolution:


    This is the generated image, with the device skin overlayed on top of the original screenshot from a simple Hello World application:

    Surprisingly, Google Play accepted this screenshot as tablet-compliant and finally decided to make my app available on tablets!

    So I guess the conclusion is that Google simply analyzes the tablet screenshots, looks for white space (most likely at the bottom and perhaps at the right), and complains that the app is not tablet-compliant if there is too much of it. This assumes that a properly designed tablet app should make full use of the screen space and expand all the way to the bottom of the screen. By using the Device Art Generator, we have satisfied this criterion: adding the device skin around the screenshot makes Google think that the screen space is fully utilized!

    While I do not endorse using this trick on production apps, the Device Art Generator is good as a quick fix for developers who want to make their existing phone-only apps available to tablet users on Google Play without the hassle of re-designing the existing app layout files.

    Keyboard issues in GRUB bootloader on a Mac Mini booting Mac OS, Windows and Ubuntu Linux

    The Mac Mini, my main machine for daily work, has the following partition configuration for triple-booting Windows, Mac and Ubuntu Linux: 
    • Partition 1: Mac OS X (HFS+)
    • Partition 2: Windows 8 (NTFS)
    • Partition 3: Ubuntu Linux (Ext4) 
    • Partition 4: DATA (NTFS)
    rEFIt is used as the boot manager to allow me to select which partition to boot from at startup. GRUB2 is installed on partition #2 and configured to select between Windows 8 and Linux. This configuration has been working well for a few years.

    However, recently after the old USB keyboard (a Microsoft Wired Keyboard 600) failed and had to be replaced with a Prolink PKCS-1002 keyboard, I could no longer select between Windows and Linux at the GRUB2 boot menu, and the system booted to Windows by default. The selection of the Mac OS X partition from the rEFIt menu still worked fine. Once booted to Mac OS, Windows or Linux (by changing the GRUB default entry), I could use the keyboard without hassle. The keyboard issue would still remain even when the Windows 7 BCD bootloader was used, suggesting that the issue was not specific to the GRUB bootloader.

    You would probably tell me to go to BIOS and enable USB legacy support, but hey, this is a Mac that uses EFI and boots Windows via BIOS emulation, which most likely would already have legacy USB support, otherwise the old keyboard could not have worked.

    Adding keyboard support to GRUB menu

    After some research, I decided to follow the advice in this forum thread, which basically told me to add the following lines to /etc/default/grub:

    GRUB_PRELOAD_MODULES="usb usb_keyboard ehci ohci uhci"
    GRUB_TERMINAL_INPUT="usb_keyboard"


    and run:

    grub-mkconfig -o /boot/grub/grub.cfg
    update-grub2

    Well, I tried that, which turned out to be a big mistake. The USB keyboard now indeed worked fine in the GRUB menu, but selecting any entry would only return the error "grub error: disk (hd0,msdos5) not found". A simple ls in the GRUB rescue console resulted in the same error. I guess preloading the keyboard modules at the GRUB menu disrupted the initialization of other system drivers, and the system failed to recognize the hard disk partition to boot from.

    I stupidly did not back up my grub.cfg file, and the only recourse was to boot from an Ubuntu Live CD, revert the above change to /etc/default/grub and follow this guide to restore the default GRUB configuration. Fortunately this worked and I was back to square one, with a non-working keyboard at the GRUB menu.

    Keyboard compatibilities

    At this point I decided to buy another keyboard, a Logitech K120, and see if the same issue persisted. Surprisingly everything worked and I was able to use the new keyboard to select either Windows or Linux to boot into.

    So what is the issue causing only the Prolink keyboard not to work? I checked the hardware ID of all three keyboards from Windows Device Manager:

    Logitech K120: VID_046D&PID_C31C [working at GRUB menu]
    Microsoft Wired Keyboard 600: VID_045E&PID_0750  [working at GRUB menu]
    Prolink PKCS-1002: VID_1A2C&PID_0027 [not working at GRUB menu]

    All 3 keyboards are recognized as HID Keyboard Device by Windows:


    Despite much effort, I could not find anything on the Device Properties page of the Prolink keyboard that would hint at why it could not work. I can only hazard a guess that its implementation of the USB Human Interface Device specification is flawed, causing it to fail with the emulated BIOS at the GRUB menu, while Windows, which presumably has more sophisticated error handling, is able to detect the keyboard without issues.

    IPDGen - Blackberry IPD backup generator from SMS CSV files

    In view of the popularity of CSV2IPD, a utility I developed back in 2010 that reads text messages from CSV files and generates an IPD file that can be imported into Blackberry devices, I have decided to put in some effort to further improve CSV2IPD and release its next version, known as IPDGen, short for IPD Generator.

    Similar to CSV2IPD, IPDGen accepts CSV files as input and generates an IPD file containing the text messages. It, however, has the ability to auto-detect the CSV file format and identify the columns containing the message text, phone number and timestamp, reducing the need to manually format the CSV files and making the tool easier to use.
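IPDGen's internal detection logic is not published, but column auto-detection of this kind can be sketched roughly as follows. The Python sketch below uses made-up heuristics (regex patterns and the "longest column wins" rule are illustrative assumptions, not IPDGen's actual algorithm):

```python
import re

def detect_columns(rows):
    """Guess which CSV columns hold the phone number, timestamp and
    message text. rows: parsed CSV records, header row excluded."""
    guess = {}
    for i in range(len(rows[0])):
        col = [r[i] for r in rows]
        # phone numbers: only digits, spaces, '+', '(', ')' and '-'
        if all(re.fullmatch(r"\+?[\d\s()-]{7,}", v) for v in col):
            guess["phone"] = i
        # timestamps: every value contains a date-like pattern
        elif all(re.search(r"\d{1,4}[-/]\d{1,2}[-/]\d{1,4}", v) for v in col):
            guess["timestamp"] = i
    # of the remaining columns, the one with the most text is
    # most likely the message body
    rest = [i for i in range(len(rows[0])) if i not in guess.values()]
    guess["text"] = max(rest, key=lambda i: sum(len(r[i]) for r in rows))
    return guess
```

A real implementation would also have to sniff the delimiter, skip offset rows and handle text encodings, as IPDGen's settings dialog suggests.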

    IPDGen requires .NET Framework 3.5 to run. The framework will be installed automatically on Windows 7 and above if it is not already present. If you are using Windows XP, you can download it here.

    This is the main user interface of IPDGen:


    Improvements

    The following features have been implemented:
    1. Option to indicate whether incoming and outgoing text messages are in a single file or in multiple files.
    2. Options to configure various CSV settings such as the delimiter character, offset row and text encoding.
    3. Options to specify the columns storing the message properties.

    With IPDGen, you can just click Browse to select the CSV files, and click Detect Settings to have it detect the CSV format for you:


    The message text, phone number and timestamp columns can be detected automatically, as seen in the above screenshot. You can then just click Convert to save the messages to an IPD file. Once done, IPDGen will report the results:


    You can now preview the generated IPD file to check if the messages have been processed correctly. This is a major improvement from CSV2IPD where only the total number of imported messages is reported.

    More information

    You can find out more information on IPDGen at:

    IPDGen home page
    Support forum
    Knowledge base

    Unfortunately, due to development costs, IPDGen is not free. A trial version, which can convert up to 25 messages, can be downloaded by users who want to experiment with the application's features. You should purchase a license in order to fully convert your text messages.

    The original version, CSV2IPD, will remain free and continues to be supported for those who do not need the advanced features of IPDGen.

    Comparison of popular PDF libraries on iOS and Android

    Being able to preview as well as edit PDF documents is a very common requirement in many Android or iOS applications, such as eBook readers or PDF form-filling utilities. This article shares the various PDF libraries I have experimented with, for the benefit of those developing PDF utilities for iOS, Android and other platforms.

    Handling PDF using built-in libraries

    To preview PDF documents on iOS, you can just use UIWebView. Much like the default browser, UIWebView supports many common file types such as PDF, Microsoft Office and multimedia documents (refer to this article for the full list). Just feed the PDF link (which can be either local or remote) to a UIWebView and your PDF will be rendered nicely:


     
    Displaying PDF in a UIWebView, however, has some issues. The default page navigation (vertical scroll) will become inconvenient if the PDF has too many pages. It also does not offer search or bookmarking capabilities, and most importantly, does not render PDF form fields or annotations well.

    The iOS SDK also comes with some Quartz 2D methods that can do basic parsing of PDF documents (via the CGPDFDocument object) and retrieve basic properties by converting them into NSDictionary objects. Any advanced manipulation requires the use of a dedicated PDF SDK.

    On Android, native PDF support only arrived in Android 4.4 (API Level 19), whose android.graphics.pdf package supports PDF creation; native rendering via PdfRenderer only came later, in API Level 21. On earlier versions, you'll need to use one of the PDF libraries described in the next few sections instead. Unlike on iOS, the default WebView component only renders web pages and does not support PDF, office or multimedia documents.

    Radaee PDF SDK

    Formerly known as PDFViewer SDK, Radaee PDF is a paid PDF rendering/editing engine that supports both Android and iOS, as well as Windows 8/RT and Windows Phone.

    It comes in three different editions, Standard, Professional and Premium, and costs as low as $489 for a single-application license. The Standard edition offers PDF viewing functionality, the Professional edition adds annotation capabilities, and the Premium edition offers PDF form editing features as well as encryption support. For most people who just need some improvements over the basic PDF features offered by UIWebView, the Standard edition, which supports displaying PDF annotations and form fields, should suffice. A trial version is available for users to test out the SDK.

    The Radaee PDF sample apps on both Google Play and iTunes demonstrate some of the SDK capabilities. This is the home screen of the Android demo app:


    The SDK methods for Android and iOS are similar, C-style and quite easy to use. There are quite a few methods to manipulate PDF annotations and form fields, such as:

    PDF_ANNOT Page_getAnnot(PDF_PAGE page, int index)
    bool Page_addAnnotInk2(PDF_PAGE page, PDF_INK hand)
    bool Page_setAnnotEditText(PDF_PAGE page, PDF_ANNOT annot, const char *text)
    int Page_getAnnotFieldName(PDF_PAGE page, PDF_ANNOT annot, char *buf, int buf_size)

    With these methods and some knowledge of PDF file format, one should be able to manipulate PDF forms programmatically with no issues.

    PDFNet SDK

    Made by PDFTron, this is a multi-platform PDF engine that supports iOS, Android, Windows Phone and Xamarin. Desktop versions for .NET, Java, C/C++ and Python are available too. The company also offers an HTML-to-PDF conversion library and a WebViewer that supports previewing a wide variety of file formats.

    The sample iOS and Android applications of PDFTron, which come with the SDK, support PDF annotations such as highlighting, drawing shapes, writing text and the like:



    The SDK also supports rendering and filling PDF forms. Its API methods provide ways to extract data from PDF forms as well as to edit PDF files, both on Android (in Java) and iOS (in Objective-C).

    Like Radaee PDF, PDFNet is not free, which is understandable. However, what I do not like about this SDK is that the website does not specify a fixed price and requires users to ask for a quotation. This leaves the final price open to negotiation, and it may vary wildly depending on seemingly unrelated factors. There is also no trial version immediately available for download from the website.

    Foxit Embedded PDF SDK

    This SDK is made by Foxit, the company that creates Foxit Reader, often used as an alternative to Adobe PDF Reader. It supports iOS, Android, Windows 8/RT/Phone and Linux.

    In addition to supporting PDF form filling and annotations, the Foxit Embedded SDK also provides a text-reflow feature, which makes PDF files easier to view on mobile devices. It also has an add-on module providing integration with Digital Rights Management (DRM) solutions, which may be helpful for those working on security-related projects.

    In my case, I downloaded the 30-day trial version from the website to test its various capabilities on iOS. Although the C-like methods seem a bit cryptic at first, the SDK comes with a sample app which makes things easy to understand.

    As for the price, the Foxit PDF SDK does not have a fixed price listed on its website either and only allows users to request a quotation. Again, this could result in an unexpectedly high quote if you are not careful in answering the questions in the quotation form.

    PSPDFKit

    This is a nice PDF engine for both iOS and Android. It supports both PDF form filling and annotations. There is a trial version available for download on its website, although users would need to contact sales to ask for a quotation in order to purchase.

    More information can be viewed here for iOS and here for Android.

    PlugPDF

    This library supports both Android and iOS and costs 699 USD for a single-app license. It seems to support annotations; however, no form editing support is mentioned on the website.

    A 30-day trial version is available for download.

    Debenu Quick PDF Library

    This is a multi-platform PDF library that seems to support PDF creation, rendering and editing. PDF form filling and annotations are also supported. The SDK comes with example applications that run on Windows, Mac and iOS.

    A single developer license is $499 and a trial version is available for download from its website. 

    Adobe Reader Mobile SDK

    This SDK, provided by Adobe, supports Mac OS, Win32, iOS and Android and allows users to preview PDF and other popular ebook formats, such as EPUB, nicely on their devices. Although it seems like an interesting library, its website provides no downloads or additional information on the SDK, except for an email address for the sales department.

    Adobe also provides a desktop-only library, Adobe PDF Library SDK with extensive PDF integration capabilities.

    PDF libraries that support only iOS

    A few PDF libraries that support only iOS, but not other platforms, are listed below for your reference:

    FastPDFKit

    This is an alternative to PSPDFKit and comes in four different versions: Free, Basic, Plus and Extra. The Free edition comes with all the capabilities but shows a predefined splash screen in every application that uses the SDK. The Basic version, available at 990 EUR, only offers basic viewing support. The Plus version (1990 EUR) adds text search and extraction capabilities but no multimedia support. For all the available features, you would need to purchase the Extra version at 2990 EUR.

    FastPDFKit does not seem to support forms or annotations. The prices are therefore a bit too high for a basic PDF viewer library that can only search and extract text from PDF files.

    PDFTouch SDK

    This SDK comes in three editions: Single App ($1999), Developer ($6999) and Enterprise ($9999). The Single App license can only be used in a single application, whereas the others can be used to build multiple applications and include the source code. The Enterprise license does not require license key activation.

    PDFTouch can only render annotations but does not support PDF form filling.

    ILPDFKit

    This free PDF library supports previewing PDF forms and provides some simple methods for extracting form field data. Annotations are not supported, however.

    PDF libraries that support only Android

    Below are some Android-only PDF libraries:

    qPDF Toolkit

    This Android-only SDK supports PDF rendering as well as form filling and data extraction. It also supports creating PDFs, exporting pages to images, editing document properties, document encryption and other advanced features.

    A trial version which will embed a watermark on the PDF documents can be downloaded from its website. To purchase, users would need to ask for a quotation.

    Aspose.Pdf for Android

    This SDK supports creating, previewing and editing PDF documents with advanced features like form filling, annotations and adding text/graphics elements. The cheapest license, Standard Developer Small Business, is $799 and a trial version is available for download from the website.

    iTextCore

    This SDK supports Android, .NET and Java and allows creating PDF forms, filling forms, and adding annotations.

    Interestingly, iTextCore is offered under the Affero General Public License (AGPL), which allows you to use the library for free as long as you disclose the source code of your application. For closed-source projects, you will need to contact them for a quotation.

    The latest library source code of iTextCore can be found on GitHub.

    Other open-source PDF libraries

    The following are some open-source PDF engines that can be ported to various platforms, including mobile devices, but do not officially support iOS or Android:

    PDFium

    This is an open-source PDF rendering engine that supports viewing, printing and filling PDF forms. The code was written by Foxit, the same company that developed the Foxit Embedded PDF SDK, and forms part of the Google Chrome PDF plugin.

    More information can be found in this blog post. Whether part of this code is used in Foxit's commercial PDF SDKs, or whether it can be compiled to target mobile platforms, remains to be seen, as nobody has attempted it yet.

    PoDoFo

    PoDoFo is a free, portable C++ library which includes classes to parse a PDF file and modify its contents in memory. It also includes very simple classes to create PDF files.

    An iOS port is available on GitHub.

    libHaru

    This is a tiny open-source library written in C/C++ which allows creating simple PDF files with text and graphics. It cannot read PDF files or create files with forms or annotations.

    This is an example of a PDF document created by libHaru:

    An iPhone port that comes with a sample app can be found here. There is also an Android port which uses the Java Native Interface (JNI) via the Android NDK.

    The verdict: which one is the best?

    Having personally explored the PDFNet SDK, Foxit Embedded SDK and Radaee PDF SDK in depth, while experimenting with other libraries in an effort to find the most suitable library for my previous projects, I would say Radaee PDF SDK, available for both iOS and Android, is the best choice for those who want to render PDFs with support for form filling and annotations. PDFNet SDK offers some advanced API methods for PDF manipulation, but its non-transparent pricing is something I would rather avoid.

    Other free alternatives like libHaru and ILPDFKit may also be useful for projects which do not require full PDF capabilities and only have very specific needs for certain functionalities, such as PDF creation or form field extraction. If your project targets Android and is open-source, iTextCore would make an excellent choice.

    One final comment I would like to make is regarding the license costs of these PDF libraries. Most companies either ask for very high prices or require a quotation from their sales staff. Very few offer clear, affordable prices targeting single developers or small companies who just want to develop a few apps that make use of PDF-specific features. Why? I guess it could be partially due to the lack of native PDF support on mobile platforms, which results in more development effort and higher SDK costs. There could be more reasons, which I will leave as an exercise for the readers.

    Restoring text from PDF files encoded using custom CID fonts

    Recently, while reading Dune: The Battle of Corrin, a 2004 science fiction novel, from a PDF e-book using Foxit Reader, I realized that I could not copy text from the document. Only garbled characters, not readable text, would be placed on the clipboard. For the same reason, I also could not search the document for specific words or phrases.

    What is the issue here? For one thing, the PDF is not copy/print restricted, as otherwise I would not have been able to copy text at all. Such restrictions can also be easily removed using several PDF unlocker utilities, or by using a reader such as Evince, which does not honor these permission settings and simply allows users to perform all operations.

    In my case, the Copy option clearly shows when the selected text is right-clicked:


    By switching to Text Viewer mode in Foxit Reader, I could see that all the text in this PDF file has perhaps been encoded, or at least stored using a custom character set, as no readable text can be found:



    At this point, I decided to further investigate the PDF file (download an extract from here) using PDFStreamDumper and see what character sets are embedded. Not surprisingly, PDFStreamDumper was also not able to retrieve any readable text and just showed garbled characters:


    By navigating through the streams in the PDF file, I was able to locate one that seems to be responsible for the custom character encoding:


    It seems as if the PDF was not generated using standard character encoding such as Unicode or ANSI. Instead, the author has decided to use CID fonts, Adobe's custom font and character set format, to store the document text. While using CID fonts can have many advantages, especially when displaying Eastern language text, in this case I believe it was purely a deliberate attempt to make copying text from the document a hassle, as most reader applications would just copy the original encoded characters, resulting in garbled text being pasted.

    Restoring the original text

    So how would you go about copying text from such a document? The first obvious way is to study the CID fonts being used (read here for some hints), compare the decoded text and the characters being stored, reverse engineer the mapping rules and write a program to restore the original text from the PDF file based on the rules. Not an easy task for the average computer user, I assume.
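To make the first approach concrete: once a CID-to-character mapping has been reverse engineered, restoring the text is a simple translation pass over the extracted codes. The Python sketch below is purely illustrative; the mapping table is made up, and a real one would be recovered by comparing the rendered glyphs against the garbled codes produced by copy/paste (or checked against the font's CMap stream):

```python
# Hypothetical CID-to-character table recovered by comparing glyphs
# rendered on screen with the garbled codes from the clipboard.
CID_TO_CHAR = {0x01: "D", 0x02: "u", 0x03: "n", 0x04: "e"}

def restore_text(cids):
    """Translate a sequence of CID codes back into readable characters,
    keeping U+FFFD for any code not present in the recovered table."""
    return "".join(CID_TO_CHAR.get(c, "\ufffd") for c in cids)
```

In a real document the table would have hundreds of entries, which is exactly why this route is not practical for the average computer user.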

    Another possible approach is to use a window text capture tool such as TextCatch, which tries to grab the text from the selected window as it is displayed. However, for most PDF files I tested, even those without special character encoding, TextCatch does not seem to be able to retrieve any text from the PDF viewer window.

    Is there an easier way? How about performing optical character recognition on the PDF file? In Adobe Acrobat, this can be performed via the View > Tools > Text Recognition menu:


    And this is the result:


    Nope, it didn't work because the stored text, although encoded, still appears renderable to Adobe Acrobat, which refuses to work unless the page is an image that can be OCR'ed.

    I tried again by using the Foxit Reader PDF printer to print the document into another PDF file. This way, the resulting file has each page stored as an image, with the custom character encoding removed. As a result, Adobe Acrobat's OCR now worked properly:


    If Adobe Acrobat still says your file contains renderable text, print the printed document to another PDF file and try again; this ensures that any left-over text is also converted to graphics.

    After saving the resulting file and opening it with Foxit Reader, I could see that the document now contained readable text:


    Text inside the PDF can now be searched without issues:


    For some reason, the characters in the final PDF file after the OCR process seem thicker compared with the original, but this should not be an issue for most people. There could also be some misspelled words as a result of the optical character recognition algorithm. However, overall the quality is satisfactory, and most of the text inside the original document has been restored to facilitate copying and searching. The final document containing searchable text can be downloaded here.

    I believe this trick to perform OCR on the PDF-printed copy of the original document will help other people with similar problems. I hope somebody with some expertise on PDF format can help me do a better job by proposing a method to restore the original text in a more faithful manner without using OCR.

    Archiving iOS projects from command line using xcodebuild and xcrun

    In one of my recent projects, I needed to provide 16 different builds from a single iOS application code base. Although the different builds, targeting different customers with different application names, server addresses and build configurations, had already been set up as different schemes in Xcode, performing these builds manually in Xcode was still time consuming and error-prone.

    Fortunately, cleaning, building, archiving and exporting an iOS project to an IPA file can be done easily using the following xcodebuild commands (assuming Xcode 5 is installed):

    xcodebuild -project Reporter.xcodeproj -scheme "InternalTest" -configuration "Release Adhoc" clean

    xcodebuild -archivePath "InternalTestRelease.xcarchive" -project Reporter.xcodeproj -sdk iphoneos  -scheme "InternalTest" -configuration "Release Adhoc" archive

    xcodebuild -exportArchive -exportFormat IPA -exportProvisioningProfile "My Release Profile" -archivePath "InternalTestRelease.xcarchive" -exportPath "InternalTestRelease.ipa"

    After cleaning, the project is archived to a .xcarchive file, exported to an IPA file and signed using the given provisioning profile, ready to be distributed for internal testing.
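    Since the motivation here is automating 16 scheme builds, the three commands above can be wrapped in a small helper and looped over the schemes. The sketch below is a dry run, assuming hypothetical scheme names (CustomerA, CustomerB) and the profile name from the example; it prints each command so the pipeline can be reviewed before replacing the echo with real execution:

```shell
#!/bin/sh
# Dry-run sketch: emit the clean/archive/export pipeline for one scheme.
# Scheme names and the provisioning profile name are placeholders.
build_cmds_for() {
    scheme="$1"
    echo "xcodebuild -project Reporter.xcodeproj -scheme \"$scheme\" -configuration \"Release Adhoc\" clean"
    echo "xcodebuild -archivePath \"${scheme}Release.xcarchive\" -project Reporter.xcodeproj -sdk iphoneos -scheme \"$scheme\" -configuration \"Release Adhoc\" archive"
    echo "xcodebuild -exportArchive -exportFormat IPA -exportProvisioningProfile \"My Release Profile\" -archivePath \"${scheme}Release.xcarchive\" -exportPath \"${scheme}Release.ipa\""
}

# One invocation per scheme; extend the list to all 16 schemes as needed.
for scheme in InternalTest CustomerA CustomerB; do
    build_cmds_for "$scheme"
done
```

Piping the output to sh would execute the builds; keeping the command construction in one function also makes it easy to change the configuration name in a single place.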

    This seems easy. However, as with Xcode (or many other Apple developer tools, for that matter), xcodebuild comes with bugs, sometimes hard-to-find ones. After a few rounds of testing, I realized that the generated signed IPA file would sometimes fail to install on the device. When attempting to install the IPA file, the iPhone Configuration Utility reports "The executable was signed with invalid entitlements", with the following detailed messages in the console log:

    Admin-iPhone installd[31] : 0x2ffee000 MobileInstallationInstall_Server: Installing app com.ios.testapp
    Admin-iPhone installd[31] : 0x2ffee000 verify_signer_identity: MISValidateSignatureAndCopyInfo failed for /var/tmp/install_staging.9sdyqR/TestApp.app/TestApp: 0xe8008016
    Admin-iPhone installd[31] : 0x2ffee000 do_preflight_verification: Could not verify executable at /var/tmp/install_staging.9sdyqR/TestApp.app


    Basically, the IPA file was not signed properly and could not be installed. What made this very strange was that although all the provisioning profiles and signing identities were configured correctly, the signing issue still occurred intermittently: one attempt would produce an incorrectly signed IPA file, while the next would produce a correctly signed one that installed fine on the device. Frustrated, I decided to investigate and found the root cause of the issue.

    First, I checked whether the correct provisioning profile was used to sign the IPA file by extracting the file as if it were a ZIP archive and searching the .app folder for a file called embedded.mobileprovision. It was indeed the correct signing profile, even when the generated IPA file was corrupted.

    Secondly, I compared the files extracted from the correctly signed IPA package and from the corrupted one to see the differences. The digital signatures for the signed components are stored in a file named CodeResources, an XML-formatted file located in the .app folder, which looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
    <key>files</key>
    <dict>
    <key>32x32_arrow.png</key>
    <data>
    P/XDWeYKpPpwSzLCFxSXV23inIQ=
    </data>
    <key>logo.jpg</key>
    <data>
    L+8Od1POJVPM7BFJPofhiR2rDso=
    </data>
    ......
    </dict>
    </dict>
    </plist>

    After comparing the CodeResources files of the correct and corrupted IPA packages, I realized that most of the digital signatures were actually the same; the only difference was the signature for the application executable, e.g. TestApp. The MD5 hashes of these two executables were indeed different, explaining the difference in signatures. This still did not explain the corrupted package, until I checked the MD5 hash of every extracted file from both packages and realized that, although the generated Core Data model files (.mom and .omo) in the two packages were different, they had the same digital signatures in CodeResources! This explained why the corrupted package failed the integrity check and could not be installed on the device. To verify this theory, I overwrote the .omo and .mom files in the corrupted package with those from the correct IPA package while keeping the rest of the files the same. The modified IPA file could indeed be installed and run on the device, confirming that the incorrect digital signatures were the issue.
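    The per-file hash comparison between the two extracted packages can be scripted. The sketch below uses POSIX cksum as a portable stand-in for the MD5 comparison described above; directory names are placeholders for the two extracted IPA trees:

```shell
#!/bin/sh
# Sketch: checksum every file in two extracted IPA trees and diff the lists.
# Files whose content differs (e.g. the .mom/.omo model files) show up in
# the diff output; cksum stands in for md5 for portability.
compare_trees() {
    ( cd "$1" && find . -type f -exec cksum {} + | sort -k3 ) > /tmp/tree_a.sum
    ( cd "$2" && find . -type f -exec cksum {} + | sort -k3 ) > /tmp/tree_b.sum
    diff /tmp/tree_a.sum /tmp/tree_b.sum
}
```

Note that diff (and hence the function) exits non-zero when any file differs, which is convenient for use in a build script that should stop on a mismatch.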

    But why does this issue occur? During my testing, the corruption seemed to happen more often when xcodebuild was run repeatedly from the command line to build one project after another. It did not happen as often when xcodebuild was run once in a while, or with a sufficient delay between executions. It sounds like some caching issue causing the tool to reuse stale digital signatures. The exact answer, of course, is known only to Apple.

    To fix this problem, we need to use xcrun, another tool with similar commands to build and export the project to an IPA file:

    xcodebuild -project Reporter.xcodeproj -scheme "InternalTest" -configuration "Release Adhoc" clean

    xcodebuild -project Reporter.xcodeproj -sdk iphoneos  -scheme "InternalTest" -configuration "Release Adhoc"

    xcrun -sdk iphoneos PackageApplication -v "Internaltest/TestApp.app" -o "InternalTestRelease.ipa" --sign "iPhone Distribution: My Company Pte Ltd (XCDEFV)"

    A minor inconvenience is that xcrun requires the exact provisioning identity, e.g. "iPhone Distribution: My Company Pte Ltd (XCDEFV)", and the full path to the application to be signed. The project therefore needs to be built first using xcodebuild, with the build output path specified in the "Per-configuration Build Products Path" setting and passed to xcrun via the -v parameter.

    However, at least with xcrun, I encountered no signing issues after repeated testing, and the generated IPA packages could always be installed on the device successfully.
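    For reference, the xcrun-based workaround can be collected into one reviewable script. This sketch prints the pipeline instead of executing it so it can be inspected first; BUILD_DIR, SCHEME and IDENTITY are placeholders taken from the examples above and must match your own project settings:

```shell
#!/bin/sh
# Sketch: the xcrun-based pipeline as a single script. BUILD_DIR must match
# the "Per-configuration Build Products Path" setting; IDENTITY must be the
# exact signing identity. Prints the commands rather than running them.
BUILD_DIR="Internaltest"
SCHEME="InternalTest"
IDENTITY="iPhone Distribution: My Company Pte Ltd (XCDEFV)"

print_pipeline() {
    cat <<EOF
xcodebuild -project Reporter.xcodeproj -scheme "$SCHEME" -configuration "Release Adhoc" clean
xcodebuild -project Reporter.xcodeproj -sdk iphoneos -scheme "$SCHEME" -configuration "Release Adhoc"
xcrun -sdk iphoneos PackageApplication -v "$BUILD_DIR/TestApp.app" -o "$PWD/${SCHEME}Release.ipa" --sign "$IDENTITY"
EOF
}

print_pipeline
```

Piping the output to sh on a build host runs the actual pipeline; note that PackageApplication expects an absolute output path, hence the use of $PWD.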