Using data mining for digital ink recognition: dividing text and shapes in sketched diagrams
Version 2 2024-06-13, 09:39
Version 1 2016-03-02, 14:59
journal contribution
posted on 2024-06-13, 09:39 authored by R Blagojevic, B Plimmer, J Grundy, Y Wang
The low accuracy of text-shape dividers for digital ink diagrams is hindering their use in real-world applications. While handwriting recognition is well advanced and many recognition approaches have been proposed for hand-drawn sketches, the division of text and drawing ink has received less attention. Feature-based recognition is a common approach to text-shape division. However, the choice of features and algorithms is critical to the success of the recognition. We propose the use of data mining techniques to build more accurate text-shape dividers. A comparative study is used to systematically identify the algorithms best suited to the specific problem. We generated dividers using data mining with diagrams from three domains and a comprehensive ink feature library. An extensive evaluation on diagrams from six different domains shows that the resulting dividers, using LADTree and LogitBoost, are significantly more accurate than three existing dividers.
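
As a rough illustration of the workflow the abstract describes (compute ink features per stroke, then compare candidate algorithms on labelled strokes), the following is a minimal sketch and not the authors' implementation. The three features, the synthetic strokes, and the two classifiers (nearest centroid and 1-nearest-neighbour, swapped in here for LADTree and LogitBoost) are all assumptions chosen to keep the example self-contained.

```python
import math

def stroke_features(points):
    """Return (path length, bounding-box diagonal, total turning angle) for one stroke.
    These three features are illustrative; they are not the paper's feature library."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    length = sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))
    diagonal = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
    turning = 0.0
    for i in range(1, len(points) - 1):
        a1 = math.atan2(points[i][1] - points[i - 1][1], points[i][0] - points[i - 1][0])
        a2 = math.atan2(points[i + 1][1] - points[i][1], points[i + 1][0] - points[i][0])
        d = abs(a2 - a1)
        turning += min(d, 2 * math.pi - d)  # wrap the angle difference into [0, pi]
    return (length, diagonal, turning)

def nearest_centroid(train, x):
    """Predict the label whose mean feature vector is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in {lbl for _, lbl in train}:
        rows = [f for f, lbl in train if lbl == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        d = math.dist(centroid, x)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

def one_nn(train, x):
    """Predict the label of the single closest training example."""
    return min(train, key=lambda item: math.dist(item[0], x))[1]

def leave_one_out_accuracy(data, classifier):
    """Hold out each example in turn, train on the rest, and score the prediction."""
    correct = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        correct += classifier(train, features) == label
    return correct / len(data)

# Synthetic toy strokes (made up for this sketch): small zigzags stand in for
# handwriting, long straight lines stand in for diagram shapes.
text_strokes = [[(2 * s * i, 5 * s * (i % 2)) for i in range(8)] for s in (1, 1.5, 2, 2.5)]
shape_strokes = [[(step * i, 0) for i in range(8)] for step in (15, 20, 25, 30)]

data = [(stroke_features(s), "text") for s in text_strokes]
data += [(stroke_features(s), "shape") for s in shape_strokes]

# Comparative study in miniature: same data, same protocol, different algorithms.
results = {
    "nearest-centroid": leave_one_out_accuracy(data, nearest_centroid),
    "1-nn": leave_one_out_accuracy(data, one_nn),
}
for name, accuracy in results.items():
    print(f"{name}: {accuracy:.3f}")
```

Evaluating every candidate under the same protocol is what makes the comparison meaningful; a real study would use a far larger feature set, real annotated diagrams, and test data from domains not seen in training.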