About the parse evolution | |
guest2 | 6-Nov-2008/18:05:27+1:00 |
Je reposte ma dernière version (commentée) d'async-parser en espérant que ça va donner des idées à Carl et aux autres.(sorry for my frenglish) In this script a tested various techniques to improve the parse dialect. The source is a little bit tricky to read (i like compact code) So i have to explain some things... The main func is [parse-seek] and accept as input a file name and a block of parsing rules. It's an attempt to parse a stream file, only the data requested to fill the rules are loaded. When the rules complete or fail, the port file is closed. [Buffer] contain the current chunck of the file. [Getf] is the code block wich load data into the buffer when it's necessary. Now the problem is to modify the input rules to add [getf] in the requested places. To do this, i used [parse] again with a special set of rules called [meta-rules]. I took the opportunity to add 2 new commands in the parse dialect: [failed] do a skip end to abort the parsing. [geti word! integer!] set a word with an integer stored in little endian format. ex: geti var 1 -> load an unsigned 8 bits integer geti var 2 -> load an unsigned 16 bits integer geti var 3 -> load an unsigned 24 bits integer. geti var 4 -> load a signed 32 bits integer (see the tests below to see an usage) When the changes are made, the block of rules is flaged so that it can be reused by [parse-seek] without the need to reconstruct the rules. It is why the input rules are directly modified and not copied before the construct. REBOL [ file: %async-parser.r author: guest2 ] context [ &: &&: port: stop: n: none buffer: make binary! 5000 failed?: false failed: [(failed?: true ) end skip] ..: func [blk] [change/part & compose/deep blk && ] getf: [( if n > length? buffer [ append buffer copy/part at port index? tail buffer max 50 n + 1 - length? buffer] )] set 'parse-seek func [ [throw] file [file! url!] rules [block!] /binary /local result new oopen convert-string meta-rules ][ oopen: pick [open/read/seek open/read/seek/binary] not binary convert-string: pick [ [&: binary! &&: (.. [(as-string &/1)]) :&] [&: [string! | char!] &&: (.. [(as-binary form &/1)]) :&] ] not binary buffer: head buffer either 5000 < length? buffer [buffer: make binary! 5000 recycle][clear buffer] failed?: false unless rules/1 = 'constructed [ parse rules meta-rules: [ some [ convert-string | &: any-string! &&: (.. [[buffer: (to-paren compose [n: (length? &/1)]) getf (&/1)]]) :& skip | integer! 'skip &&: (.. [[buffer: (to-paren compose [n: (&/1)]) getf n skip]]) :& skip | 'skip &&: (.. [[buffer: (to-paren [n: 1]) getf skip]]) :& skip | 'get word! integer! &&: (.. [[ buffer: (to-paren compose [n: (&/3)]) getf (to-paren compose/deep [ set [(&/2)] to integer! as-binary cp/part buffer (&/3) buffer: skip buffer (&/3) ]) :buffer ]]) :& skip | 'failed &&: (.. [failed]) :& skip | 'end 'skip | ['to | 'thru] 'end | 'thru skip &&: (.. [some [(&/2) break | skip | failed]]) :& | 'to skip &&: (.. use [pos][[some [pos: (&/2) :pos break | skip | failed]]]) :& | word! &&: (if find [string! char!] type?/word get/any &/1 [.. [(to-string get &/1)] &&: &]) :&& | paren! | path! | into meta-rules | skip ] ] new: reduce ['constructed cp/deep rules] clear change rules new ] port: (do oopen file) if error? set/any 'result try [parse/all buffer rules/constructed][close port throw result] close port either failed? [false][:result] ] ] ;halt ;**** DISCARD ME ***** prin "*** TEST: is this a Rebol script ? -> " probe parse-seek/binary %async-parser.r [ "REBOL" any [" " | tab] "[" to end | thru "^/REBOL" any [" " | tab] "[" to end ] print "*** TEST: get size of a JPEG file ****" print "(note it's incomplete, it currently doesn't work with all jpg files)" if parse-seek/binary probe %IMG_8001.jpg [ #{FFD8} ; jpeg Header [ #{FFE0} ;* JFIF header geti len 2 ;* get block length (2 bytes) "JFIF" ;* yeah it's a JFIF (confirmation) (len: len - 6) len skip ;* skip this block some [ #{FFC0} ;* good ! i found the length properties block. 2 skip ;* don't need to know the length of this block. skip ;* filler ??? always = #{08} geti height 2 geti width 2 break ;* finished | #{FF} skip ;* skip the block. geti len 2 (len: len - 2) len skip | failed ;* error in the format ] | #{FFE1} ;* EXIF header geti len 2 ;* get length of the block ;* ... TO DO failed ] to end ][ ?? height ?? width ] halt | |
guest2 | 6-Nov-2008/19:08:23+1:00 |
Ca m'apprendra à faire des modifs de dernière minute. At line 71, replace: | 'get word! integer! &&: by | 'geti word! integer! &&: | |
Login required to Post. |