Subject:
Re: [ruby-ffi] FFI check & questions
From:
Jeffrey Jones
Date:
2/27/12 8:41 PM
To:
ruby-ffi@googlegroups.com

Hello all

I was adding some error checking (checking page numbers, etc.) to make sure that the PDF data was being read correctly, then tried converting a different PDF (one with more pages), and it worked!

So it looks like there may be some issues depending on the PDF data but I imagine this is internal to the library.

In other words, it works!

Thank you very much for the help all.

On 28/02/12 11:24, Jeffrey Jones wrote:
Hello Wayne,

Thank you for the information and advice, it was very enlightening.

1) I double-checked, and get_pagememsize does in fact return a size_t, not an int. I copied the method signatures from the documentation, but the documentation is incorrect. The header file definition is: size_t pdf2img_get_pagememsize(ImageConversion iC)

I have checked the others again and they appear to be correct.

2) Thank you for that, I had no idea that this was incorrect. Unless there are any objections, I would like to add an explicit section on that to the wiki.

I have updated the gist with the latest code.

I am still getting 4095MB returned from the get_pagememsize function, however. From what I can see, none of the Ruby code could affect that, since passing in bad data from Ruby earlier would raise an exception before reaching the pagememsize function.
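
One possible reading of the 4095MB figure, offered purely as an assumption (nothing in the header or documentation quoted here confirms it): if the library reports a failure as -1, that value reinterpreted as an unsigned 32-bit integer is 4294967295 bytes, which is just under 4096MB. The helper name as_unsigned_32 below is made up for illustration; the sketch only shows the arithmetic in plain Ruby.

```ruby
# Reinterpret a signed 32-bit integer as an unsigned 32-bit integer.
# Assumption: the C function signals an error with -1. The thread does not
# confirm this; it is only one way to arrive at a "4095MB" reading.
def as_unsigned_32(value)
  [value].pack('l').unpack1('L')
end

MEBIBYTE = 1024 * 1024

bytes = as_unsigned_32(-1)
puts bytes             # 4294967295
puts bytes / MEBIBYTE  # 4095 (integer division)
```

If that reading is right, a sanity check on the returned size before allocating would catch the problem earlier than a failed malloc.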

regards

Jeff


On 28/02/12 10:13, Wayne Meissner wrote:

There are a couple of problems:

1) The return type of get_pagememsize should be :int, not :size_t - :size_t is 64bit on 64bit platforms, and it will return bogus values from a function that actually returns an int.

2) You're using string ops on binary data.  In FFI, anything that is :string, or read_string or write_string treats the string as a NUL terminated C string - i.e. it will stop reading at the first zero byte, and ensures the data written to memory is terminated with a zero byte.

So, in pdf_data_to_pointer(), instead of MemoryPointer.from_string(), use:
    @@pointer_data = MemoryPointer.new(:char, data.size)
    @@pointer_data.put_bytes(0, data)


And in get_image(), instead of picBuf.read_string, use picBuf.get_bytes(0, image_size)

On Tuesday, 28 February 2012 10:43:07 UTC+10, Jeffrey Jones wrote:
Hi Matijs,

I could have sworn I already checked the header file for a definition,
but I obviously missed something.

The code is

typedef enum {
     EPSoutput,
     TIFFoutput,
     JPEGoutput,
     BMPoutput,
     PNGoutput,
     RAWoutput,
     PDFoutput,
     GIFoutput,
} OutputTypeCode;

So 0 would be EPS output, I assume. I changed it to 2 (JPEG) out of
curiosity, but the allocation is still 4GB.

Regarding checking if picBuf is null: I added a check immediately after
the malloc, and the pointer is indeed null.
If I cheat badly and malloc image_size / 1024, then the picBuf pointer is
not null, but then of course the get_pagemem method fails.

I have no idea why I am getting a return value of 4GB from the
get_pagememsize(iC) method. I was hoping there was some obvious mistake
(type conversion or something) I was making, but if nothing obvious
springs out to more experienced eyes then I am not sure.

Regards,

Jeff

On 28/02/12 01:14, Matijs van Zuijlen wrote:
> Hi Jeffrey,
>
> On 02/27/2012 02:26 AM, Jeffrey Jones wrote:
>> I am in the process of looking at using the following library from ruby
>> (http://www.datalogics.com/products/pdf2img/) using FFI but have run
>> into a few
>> issues mainly due to my lack of FFI experience and general lack of C++.
>>
>> [...]
>>
>> The following Gist has the example C++ code (Posted with permission)
>> and my Ruby
>> code along side some example output. https://gist.github.com/1920005
>>
>> [...]
>>
>> On line 22 of the ruby file I have defined the second argument as an
>> :int, this
>> is probably wrong but it suffices for the moment I think. I assume
>> the real
>> value is an Enumeration, struct or something along those lines. It is
>> not
>> documented so I will ask the original developers.
>
> You should take a look at the file 'pdf2imglib.h', to see what the
> values of that enumeration are. In particular, what integer value
> corresponds to GIFoutput.
>
>> Questions:
>>
>> 1. On line 40 of the ruby file the required memory is apparently 4Gb,
>> this is
>> obviously wrong, does anyone know why? (I assume I have messed
>> something up
>> somewhere)
>
> Perhaps setting the output type to 0 is wrong. I'm not sure what
> output type would demand 4GB, though.
>
>> 2. Can anyone see any other obvious mistakes on my part?
>
> I'm guessing allocating 4GB is not going very well. Do check that
> the resulting pointer isn't null, i.e., with picBuf.null?
>
> Regards,

