Latest DAA-C01 Free Dumps - Snowflake SnowPro Advanced: Data Analyst Certification
You are designing a system to ingest data from a high-volume sensor network. The sensors send data in a custom binary format to an on-premise message queue (e.g., RabbitMQ). The data needs to be converted to a structured format (e.g., JSON) before being loaded into Snowflake. Choose the most effective approach to ensure data integrity, scalability, and near-real-time ingestion.
Answer: B,E
You are tasked with aggregating website clickstream data in Snowflake to identify the most popular product categories per region on a daily basis. The clickstream data is stored in a table named 'clickstream_events' with columns 'event_time', 'user_id', 'product_id', 'region', and 'category'. You need to create a solution that efficiently identifies the top 3 categories for each region on each day. Which approach offers the best performance and scalability, considering that the dataset is expected to grow significantly?
Answer: E
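The answer options are not reproduced here, so the following is only a sketch of the per-day, per-region top-3 ranking the question describes, not the keyed answer. It uses Python's built-in sqlite3 module as a stand-in for Snowflake; the sample rows and the helper names `build_demo_db` and `top_categories` are assumptions made for illustration. In Snowflake the same ranking would more commonly be filtered with a QUALIFY clause instead of an outer WHERE.

```python
import sqlite3

# Aggregate once per (day, region, category), then rank inside each (day, region).
TOP_N_SQL = """
WITH daily AS (
    SELECT date(event_time) AS event_date, region, category, COUNT(*) AS events
    FROM clickstream_events
    GROUP BY date(event_time), region, category
),
ranked AS (
    SELECT event_date, region, category, events,
           ROW_NUMBER() OVER (
               PARTITION BY event_date, region
               ORDER BY events DESC, category
           ) AS rn
    FROM daily
)
SELECT event_date, region, category, events
FROM ranked
WHERE rn <= ?
ORDER BY event_date, region, rn
"""


def build_demo_db():
    """Create an in-memory table with made-up sample rows (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE clickstream_events ("
        "event_time TEXT, user_id INTEGER, product_id INTEGER, "
        "region TEXT, category TEXT)"
    )
    counts = [("EU", "shoes", 4), ("EU", "hats", 3), ("EU", "bags", 2),
              ("EU", "toys", 1), ("US", "toys", 2)]
    rows = [
        ("2024-01-01 10:00:00", 1, 1, region, category)
        for region, category, n in counts
        for _ in range(n)
    ]
    conn.executemany("INSERT INTO clickstream_events VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def top_categories(conn, n=3):
    """Return (event_date, region, category, events) for the top n categories."""
    return conn.execute(TOP_N_SQL, (n,)).fetchall()


demo_rows = top_categories(build_demo_db())
for row in demo_rows:
    print(row)
```

Aggregating before ranking keeps the window function working on one row per category instead of one row per click, which matters as the table grows.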
You are tasked with building a report to analyze customer churn. The data includes customer demographics, purchase history, website activity, and support interactions. You want to provide interactive filtering capabilities within the report, allowing users to explore the data based on various criteria. Which Snowflake feature(s) offer the most effective way to implement interactive filtering within your reporting solution without directly exposing underlying tables?
Answer: C
You're working with a 'WEB_EVENTS' table in Snowflake that stores user activity data. The table includes columns like 'USER_ID', 'EVENT_TIMESTAMP', 'EVENT_TYPE' (e.g., 'page_view', 'button_click', 'form_submission'), and 'EVENT_DETAILS' (a VARIANT column containing JSON data specific to each event type). You need to identify users who submitted a specific form ('contact_us') more than 3 times within a 24-hour period. However, you are concerned about data quality, and the 'EVENT_TIMESTAMP' column might contain duplicate entries for the same user and event. Which of the following SQL queries is the MOST robust and efficient way to achieve this in Snowflake, ensuring that duplicate timestamps for the same user and 'contact_us' form submission are not counted multiple times?


Answer: D
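The candidate queries are missing from this page, so the sketch below only illustrates the two ideas the question names: remove duplicate (user, timestamp) rows first, then count submissions inside a sliding 24-hour window. It runs on sqlite3 rather than Snowflake. Two simplifications are assumptions, not part of the original: timestamps are stored as integer epoch seconds in a column called `event_ts`, and the form name is a plain `form_name` column instead of a field inside the 'EVENT_DETAILS' VARIANT.

```python
import sqlite3

# Step 1: DISTINCT collapses duplicate (user, timestamp) submissions.
# Step 2: for each submission, count that user's submissions in the next 24 hours.
REPEAT_SUBMITTERS_SQL = """
WITH dedup AS (
    SELECT DISTINCT user_id, event_ts
    FROM web_events
    WHERE event_type = 'form_submission'
      AND form_name = 'contact_us'
)
SELECT DISTINCT a.user_id
FROM dedup AS a
JOIN dedup AS b
  ON b.user_id = a.user_id
 AND b.event_ts >= a.event_ts
 AND b.event_ts < a.event_ts + 86400
GROUP BY a.user_id, a.event_ts
HAVING COUNT(*) > 3
ORDER BY a.user_id
"""


def repeat_submitters(conn):
    """Return user ids with more than 3 distinct submissions in any 24h window."""
    return [row[0] for row in conn.execute(REPEAT_SUBMITTERS_SQL)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE web_events ("
    "user_id INTEGER, event_ts INTEGER, event_type TEXT, form_name TEXT)"
)
sample = [
    # user 1: four distinct submissions within 3 hours, plus one duplicate row
    (1, 0, "form_submission", "contact_us"),
    (1, 0, "form_submission", "contact_us"),
    (1, 3600, "form_submission", "contact_us"),
    (1, 7200, "form_submission", "contact_us"),
    (1, 10800, "form_submission", "contact_us"),
    # user 2: the same submission recorded four times
    (2, 0, "form_submission", "contact_us"),
    (2, 0, "form_submission", "contact_us"),
    (2, 0, "form_submission", "contact_us"),
    (2, 0, "form_submission", "contact_us"),
    # user 3: four submissions, but the last falls outside every 24h window
    (3, 0, "form_submission", "contact_us"),
    (3, 3600, "form_submission", "contact_us"),
    (3, 7200, "form_submission", "contact_us"),
    (3, 100000, "form_submission", "contact_us"),
]
conn.executemany("INSERT INTO web_events VALUES (?, ?, ?, ?)", sample)

flagged = repeat_submitters(conn)
print(flagged)
```

User 2 shows why deduplication has to happen before counting: four identical rows would otherwise look like four submissions.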
Your organization stores clickstream data in Parquet files in an external stage 's3://your-bucket/clickstream/'. The data includes nested JSON structures representing user activity. You need to create a Snowflake table to query this data efficiently, extracting specific fields from the nested JSON. The challenge is to optimize query performance by leveraging Parquet's columnar storage and schema evolution capabilities. Which of the following approaches offers the BEST combination of performance and flexibility for querying the data in Snowflake, considering potential schema changes in the Parquet files over time?
Answer: B
You have a table named 'event_data' that tracks user activities. The table contains 'event_id' (INT), 'user_id' (INT), (TIMESTAMP_NTZ), 'event_type' (VARCHAR), and 'event_details' (VARIANT). The table is partitioned by Performance on queries filtering by both 'event_type' and a specific date range on is slow. You suspect inefficient partition pruning and JSON parsing as potential bottlenecks. Which combination of actions will most effectively address these performance issues?
Answer: A
You have a Snowflake table named 'sensor_data' with a column 'reading' containing JSON data. The JSON structure varies, but you want to extract a specific nested value, 'temperature', using a UDF. The path to 'temperature' might be different depending on the 'sensor_type'. Some sensors have the temperature at '$.metrics.temperature', others at '$.reading.temp_c'. The sensor type is stored in the 'sensor_type' column. You want to create a UDF named which takes the JSON 'reading' and the 'sensor_type' as input and extracts the temperature, returning NULL if the path does not exist in the JSON. How can you implement this using a JavaScript UDF and Snowflake's JSON parsing functions for optimal performance?


Answer: C
Consider a table 'sales_data' with columns 'product_id', 'sale_date', and 'revenue'. You need to calculate the cumulative revenue for each product over time, but only for the top 10 products by total revenue. What is the most efficient way to achieve this in Snowflake?
Answer: E
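As the options are not shown on this page, the following is a hedged sketch of one common shape for this problem, not necessarily the keyed answer: pick the top products in a CTE, join back to the sales rows, and compute a running total with a window SUM. It runs on sqlite3 as a stand-in for Snowflake; the sample rows and the `cumulative_revenue` helper are made up for illustration, and the demo uses a top 2 instead of a top 10 to keep the data small.

```python
import sqlite3

# top_products limits the work to the best sellers before the window function runs.
CUMULATIVE_SQL = """
WITH top_products AS (
    SELECT product_id
    FROM sales_data
    GROUP BY product_id
    ORDER BY SUM(revenue) DESC
    LIMIT ?
)
SELECT s.product_id,
       s.sale_date,
       SUM(s.revenue) OVER (
           PARTITION BY s.product_id
           ORDER BY s.sale_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS cumulative_revenue
FROM sales_data AS s
JOIN top_products AS t ON t.product_id = s.product_id
ORDER BY s.product_id, s.sale_date
"""


def cumulative_revenue(conn, top_n=10):
    """Running revenue per product, restricted to the top_n products by total."""
    return conn.execute(CUMULATIVE_SQL, (top_n,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (product_id INTEGER, sale_date TEXT, revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", 100), (1, "2024-01-02", 50),   # total 150
        (2, "2024-01-01", 10), (2, "2024-01-02", 20),    # total 30
        (3, "2024-01-01", 200), (3, "2024-01-03", 5),    # total 205
    ],
)

running = cumulative_revenue(conn, top_n=2)
for row in running:
    print(row)
```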
You're using Snowsight to build a dashboard for monitoring website performance. The data is in a table called 'WEB_EVENTS' with columns: 'EVENT_TIME' (TIMESTAMP_NTZ), 'EVENT_TYPE' (VARCHAR, e.g., 'page_view', 'button_click'), 'USER_ID' (VARCHAR), and 'PAGE_URL' (VARCHAR). You want to create a tile that shows the average time between consecutive 'page_view' events for each user over the last 7 days. This will help you understand how users are navigating the site. Assume that for a single user, page_view events are ordered by EVENT_TIME. Which of the following SQL queries, when used as the basis for a Snowsight tile, will correctly calculate this average time difference in seconds?


Answer: E
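The candidate queries are not reproduced on this page. As a rough sketch of the calculation the question describes, the example below pairs each 'page_view' with the previous one for the same user using LAG, then averages the gaps. It runs on sqlite3, so the time arithmetic uses strftime('%s', ...) where Snowflake would typically use DATEDIFF('second', ...). The `as_of` parameter, the sample rows, and the `avg_page_view_gap` helper are assumptions added so that the 7-day filter is reproducible.

```python
import sqlite3

# Filter to recent page views first; LAG then sees only those rows per user.
AVG_GAP_SQL = """
WITH views AS (
    SELECT user_id,
           CAST(strftime('%s', event_time) AS INTEGER) AS ts,
           LAG(CAST(strftime('%s', event_time) AS INTEGER)) OVER (
               PARTITION BY user_id ORDER BY event_time
           ) AS prev_ts
    FROM web_events
    WHERE event_type = 'page_view'
      AND event_time >= datetime(:as_of, '-7 days')
)
SELECT user_id, AVG(ts - prev_ts) AS avg_gap_seconds
FROM views
WHERE prev_ts IS NOT NULL
GROUP BY user_id
ORDER BY user_id
"""


def avg_page_view_gap(conn, as_of):
    """Average seconds between consecutive page views per user, last 7 days."""
    return conn.execute(AVG_GAP_SQL, {"as_of": as_of}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE web_events ("
    "event_time TEXT, event_type TEXT, user_id TEXT, page_url TEXT)"
)
conn.executemany(
    "INSERT INTO web_events VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01 09:00:00", "page_view", "u1", "/old"),       # outside the window
        ("2024-01-09 10:00:00", "page_view", "u1", "/home"),
        ("2024-01-09 10:00:10", "button_click", "u1", "/home"),   # ignored: not a page view
        ("2024-01-09 10:00:30", "page_view", "u1", "/products"),  # gap of 30 s
        ("2024-01-09 10:01:30", "page_view", "u1", "/cart"),      # gap of 60 s
        ("2024-01-09 11:00:00", "page_view", "u2", "/home"),      # single view: no gap
    ],
)

gaps = avg_page_view_gap(conn, "2024-01-10 00:00:00")
print(gaps)
```

Users with a single page view in the window produce no row, because their only LAG value is NULL and is filtered out before averaging.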
A data analyst needs to process a large JSON payload stored in a VARIANT column named 'payload' in a table called 'raw_events'. The payload contains an array of user sessions, each with potentially different attributes. Each session object in the array has a 'sessionId', a 'userId', and an array of 'events'. The events array contains objects with 'eventType' and 'timestamp'. The analyst wants to use a table function to flatten this nested structure into a relational format for easier analysis. Which approach is most efficient and correct for extracting and transforming this data?
Answer: E
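To make the target shape concrete, here is a small Python sketch of the two-level flattening the question describes: one output row per event, carrying its parent session's identifiers. In Snowflake this is typically done with two chained LATERAL FLATTEN calls; the Python below only mirrors that logic and is not the keyed answer. The top-level key 'sessions' and the sample payload are assumptions, since the question does not say where the session array sits inside 'payload'.

```python
import json


def flatten_sessions(payload):
    """Return one (sessionId, userId, eventType, timestamp) tuple per event."""
    rows = []
    # Outer loop: one iteration per session object (first flatten level).
    for session in payload.get("sessions", []):
        # Inner loop: one iteration per event in that session (second level).
        for event in session.get("events", []):
            rows.append(
                (
                    session.get("sessionId"),
                    session.get("userId"),
                    event.get("eventType"),   # None when the attribute is absent
                    event.get("timestamp"),
                )
            )
    return rows


# Made-up payload; parsed from text to mimic reading a VARIANT value.
raw = """
{
  "sessions": [
    {"sessionId": "s1", "userId": "u1",
     "events": [
       {"eventType": "page_view", "timestamp": "2024-01-01T10:00:00Z"},
       {"eventType": "button_click", "timestamp": "2024-01-01T10:00:05Z"}
     ]},
    {"sessionId": "s2", "userId": "u2", "events": []}
  ]
}
"""

flat_rows = flatten_sessions(json.loads(raw))
for row in flat_rows:
    print(row)
```

Session 's2' yields no rows because its events array is empty, which matches the default behaviour of FLATTEN when its OUTER argument is left at FALSE.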
A marketing team needs a daily report showing the conversion rate of leads to customers. They define conversion rate as (Number of Customers Acquired / Total Number of Leads) * 100. The data resides in two tables: 'LEADS' and 'CUSTOMERS'. 'LEADS' contains all leads generated daily, and 'CUSTOMERS' contains all acquired customers. Both tables have a 'LEAD_ID' and an 'ACQUISITION_DATE' field, with 'ACQUISITION_DATE' being NULL in the 'LEADS' table. They want the report automated and delivered via email. Which combination of Snowflake features would BEST accomplish this task?
Answer: D
You observe a 'Spilling to Local Storage' event in the Query Profile of a complex aggregation query. This indicates that the intermediate results are exceeding the memory capacity of the virtual warehouse. Which of the following actions would be MOST effective in mitigating this issue and improving the query performance?
Answer: A,B,D