Dynamic Websites: Scraping data fields and Tables #517
alpha-code2019 asked this question in Q&A (unanswered)
Replies: 1 comment
- A bit too long of a question; no one will really take the effort to understand all the writing. You will have to make the effort to break it down into smaller pieces.
---
I hope the community does not mind me asking many questions - I am new to Automa and I am trying to understand how best to work with this really great app!
Basic Info
I am trying to scrape data from the website https://bizbuysell.com
There are at least 19 data fields that I want to scrape.
A data field value can be text, a number, a date, or a link.
Main Problem
The person who posts a business for sale is presented with a data entry form.
If this person leaves a data entry field empty, then this data field is not shown at all on the listing page for that business.
Only the data fields that have a value are shown, and only these can be scraped.
Question 1:
The workflow of Automa (at least as I have experienced it so far) is as follows:
---> Each scraped value is saved, if I wish (by ticking the respective checkbox), as a variable and/or in a table.
My goal is to export the scraped data to a Notion database and/or an MS Excel table. Both of these tables show all 41 data fields as column headers.
Is there a way to tell Automa into which column each scraped value should be exported, i.e. data mapping?
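To illustrate what I mean by data mapping, here is a minimal sketch (outside of Automa, with made-up column names, not the real field names) of how a scraped record would have to be normalized into the fixed column order of the target table:

```ts
// Sketch only: map one scraped record onto a fixed column schema so it
// lines up with the Notion/Excel headers. The column names are placeholders.
const COLUMNS = ["Title", "Asking Price", "Cash Flow", "Location"]; // ...extend to all 41

function toFixedRow(scraped: Record<string, string | undefined>): string[] {
  // Fields the seller left empty become empty cells instead of shifting later columns.
  return COLUMNS.map((col) => scraped[col] ?? "");
}

// A listing where "Cash Flow" was left empty by the seller:
console.log(toFixedRow({ Title: "Coffee Shop", "Asking Price": "$250,000", Location: "Austin, TX" }));
// -> ["Coffee Shop", "$250,000", "", "Austin, TX"]
```

Is something like this possible inside Automa itself, or does it have to be done as a separate step after the export?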
Question 2:
Automa offers several possible export formats.
The problem is that text is scraped too, and that text can contain all possible "CSV separators" as part of its content. Which format is best for exporting the data so that it can be imported into MS Excel and into a Notion database?
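For reference, my understanding is that the separator problem is normally handled by quoting the affected fields (RFC 4180 style); a minimal sketch of what I would expect an export to do:

```ts
// Sketch of RFC 4180-style quoting: commas, quotes and line breaks inside a
// scraped text then no longer break the column layout when Excel imports it.
function csvField(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return `"${value.replace(/"/g, '""')}"`; // wrap in quotes, double inner quotes
  }
  return value;
}

function csvRow(values: string[]): string {
  return values.map(csvField).join(",");
}

console.log(csvRow(["Coffee Shop", 'Seller says: "great location, loyal customers"']));
// -> Coffee Shop,"Seller says: ""great location, loyal customers"""
```

Does Automa's CSV export already quote fields like this, or would another format be safer and then get converted afterwards?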
Question 3:
So far, in checking many BizBuySell.com web pages, I have identified 41 data fields that could be scraped if they contain a value.
The UI for creating a new Automa workflow is graphically designed, very good, and intuitive. But it has its limitations when 41 data fields need to be scraped and each one needs a conditional block in front of its "Get Text" block.
Is there a faster/easier way to define these blocks? Two blocks for each data field, multiplied by 41 data fields - that adds up to 82 blocks to be created (see the sketch below).
Forum Discussion - Dynamic Websites - 20220530.pdf
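As a sketch of the kind of shortcut I am hoping for (the selectors below are placeholders, not the real ones from my .ods file): a single scripted step that walks over a selector list and writes an empty string for every missing field, instead of 82 individual blocks:

```ts
// Sketch only: one scripted extraction step instead of 82 blocks.
// The selectors are placeholders; the real ones are in "XPath & CSS Selectoren.ods".
const FIELD_SELECTORS: Record<string, string> = {
  "Asking Price": ".asking-price .value",
  "Cash Flow": ".cash-flow .value",
  "Gross Revenue": ".gross-revenue .value",
  // ...one entry per data field, up to 41
};

const row: Record<string, string> = {};
for (const [field, selector] of Object.entries(FIELD_SELECTORS)) {
  const el = document.querySelector(selector);
  // Fields the seller left empty are simply not on the page; write "" so every
  // row still has all columns and no per-field conditional block is needed.
  row[field] = el?.textContent?.trim() ?? "";
}
console.log(row);
```

Could something like this run inside a single "JavaScript Code" block in Automa (I believe such a block can pass its result on, e.g. with automaNextBlock, but I am not sure about the exact call or whether it can insert directly into the table), or is there another way to avoid creating all 82 blocks by hand?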
Question 4:
I have uploaded the file "XPath & CSS Selectoren.ods", which contains the XPath and the CSS path for all 41 data fields. I collected these path definitions using Automa's element selector feature.
I can't see a system in that data. Does it make sense?
XPath & CSS Selectoren.ods
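For anyone looking at the .ods file: the selectors can be tested against an open listing page in the browser DevTools console, roughly like this (both selectors below are placeholders, not entries from the file):

```ts
// Sketch: check one CSS selector and one XPath against the current page.
const byCss = document.querySelector(".asking-price .value");
console.log("CSS match:", byCss?.textContent?.trim());

const byXPath = document.evaluate(
  "//div[contains(@class,'asking-price')]//span", // placeholder XPath
  document,
  null,
  XPathResult.FIRST_ORDERED_NODE_TYPE,
  null
).singleNodeValue;
console.log("XPath match:", byXPath?.textContent?.trim());
```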
Thank you for looking into these questions - I hope I can get some answers!