Thread: Convert PDF to HTML

  1. #1
    Join Date
    May 2010
    Manila, Philippines

    Convert PDF to HTML

    Hi guys,

    Is it possible for PHP to be able to convert a pdf file into html?

  2. #2
    Join Date
    Apr 2013
    I think it is possible. I can share the image converter I am currently using with you; it can do PDF to HTML conversion. I am not an expert in this field, so the only way I know is to use an image converter. I have looked for solutions in many forums, but I couldn't find a way to do it myself. So if you haven't found a good way to do this yet, you can give it a try.

  3. #3
    Join Date
    Jul 2012
    Yes, this is possible, but it will not be fun to do in PHP.

    Here are the concepts:

    PDF files have their own page-description language, so you can match its operators (the "tags") with regex and map them to HTML tags. I have not done this for a long time, and I did it in C, so I can't tell you exactly what to look for, assuming you have the file in a readable source form. The contents of the file are normally compressed, so to match anything you first have to understand and undo the compression. Once you have the uncompressed source, write it to a temporary .txt file and parse that (match the tags).

    You can start with the basics, then get more advanced by handling the scripts in more dynamic PDFs (I think they use JavaScript, but I don't remember for certain); you could output JavaScript for those! It is not an easy task, but it is more fun to do it this way, and there might be existing modules that are more efficient.

    If you really wanted to, you could do this in JavaScript so that the user's computer does the processing instead of your servers. But if the project got too big, the download might take too long on slower connections, and you would also need a workaround for uncompressing the file and holding the output somewhere in the browser to parse. It could be a fun experiment to find out whether the JS version or the PHP version works out better for you. Post your results here, or make it an open source project on GitHub.
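
    As a rough illustration of the "uncompress, then match" idea above, here is a minimal sketch. It is written in Python rather than PHP or C only to keep it short; in PHP the same two steps would lean on gzuncompress() and preg_match_all(). It assumes the streams use the common FlateDecode (zlib) filter and that text is drawn with the simple "(string) Tj" operator. The names inflate_streams and show_text_operands are made up for this example.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of an object.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)
# A literal string operand followed by the Tj (show text) operator.
TJ_RE = re.compile(rb"\((.*?)\)\s*Tj", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the inflated bytes of every stream that is valid zlib data."""
    inflated = []
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the end-of-line bytes that sit
            # between the compressed data and "endstream".
            inflated.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not Flate data (for example an image or another filter)
    return inflated


def show_text_operands(content):
    """Return the string operands of all simple Tj operators in a stream."""
    return [m.group(1).decode("latin-1") for m in TJ_RE.finditer(content)]
```

    Real files also use the TJ array operator, escaped parentheses, and font-specific encodings, so treat the regex as a starting point and not as a full parser.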

    Eventually you can streamline your solution to be more efficient, but for now, to understand the process, do this:

    user uploads file => validate the file, uncompress the PDF, and write it to a temporary folder (as .txt) => parse the file => output in the form you want and delete the temporary file
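
    The flow above could be wired together roughly like this. It is only a sketch in Python: convert_upload and looks_like_pdf are invented names, the validation is just a header check, and the uncompress and parse steps are hypothetical callables that you would supply yourself.

```python
import os
import tempfile


def looks_like_pdf(data):
    """Cheap validation: every PDF file begins with a "%PDF-" header."""
    return data.startswith(b"%PDF-")


def convert_upload(data, uncompress, parse):
    """Validate -> uncompress to a temporary .txt -> parse -> clean up.

    uncompress (bytes -> bytes) and parse (str -> str) are hypothetical
    callables provided by the caller; they are not part of any library.
    """
    if not looks_like_pdf(data):
        raise ValueError("uploaded file is not a PDF")

    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # Write the uncompressed source to the temporary file.
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(uncompress(data))
        # Read it back and hand it to the parser.
        with open(path, "r", encoding="latin-1") as tmp:
            return parse(tmp.read())
    finally:
        # Delete the temporary file even if uncompressing or parsing fails.
        os.remove(path)
```

    Putting the delete in a finally block means a failed parse does not leave uploads lying around in the temporary folder.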

  4. #4
    Join Date
    May 2013
    I want to be able to convert a PDF file to an HTML file via PHP, but am running into some trouble.

    I found a basic way to do this using Saaspose, which lets you convert PDFs to HTML files. There are some problems with the output, however, such as its use of SVGs, images, positioning, fonts, etc.

    All I would need is the ability to grab the text from the PDF file and any images associated with it, and then display it in a linear format as opposed to it being formatted with absolute positioning.
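
    One way to get from absolutely positioned text to a linear layout is to sort the pieces into reading order and emit plain paragraphs. The sketch below is in Python and assumes you already have the text as (x, y, text) fragments in PDF coordinates, where y grows upward; that input format, the linearize name, and the 2-point line tolerance are all assumptions for illustration, not something Saaspose is known to produce.

```python
from html import escape


def linearize(fragments, line_tolerance=2.0):
    """Turn (x, y, text) fragments into one <p> per visual line.

    Fragments whose y values differ by at most line_tolerance from the
    first fragment of a line are treated as part of that line.
    """
    lines = []  # each entry: [line_y, [(x, text), ...]]
    # Highest y first, because PDF pages are read from the top down.
    for x, y, text in sorted(fragments, key=lambda f: -f[1]):
        if lines and abs(lines[-1][0] - y) <= line_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append([y, [(x, text)]])

    paragraphs = []
    for _, parts in lines:
        # Within a line, order the pieces left to right.
        words = " ".join(text for _, text in sorted(parts))
        paragraphs.append("<p>" + escape(words) + "</p>")
    return "\n".join(paragraphs)
```

    Images are not handled here; they would need to be extracted separately and slotted into the same top-to-bottom ordering.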

  5. #5
    Join Date
    Jun 2013
    Originally posted by iahne:
    Hi guys,

    Is it possible for PHP to be able to convert a pdf file into html?
    Well, you could use XFlip pdf to flip for this.
    Personally speaking, it is possible to convert a PDF file to an HTML file with PHP, but you could also try other tools that you find through a Google search!

  6. #6
    Join Date
    Apr 2013
    Thanks for sharing. I wonder whether there are any differences between the PDF converter I am testing these days and the one you mentioned above? Any suggestions would be appreciated. Thanks in advance.

    Best regards,
