Latest DAA-C01 Free Dumps - Snowflake SnowPro Advanced: Data Analyst Certification

Question 1

A Snowflake data analyst is tasked with optimizing the performance of a frequently executed query. The execution plan reveals a 'TableScan' operation on a large table named 'SALES_DATA'. The 'SALES_DATA' table is clustered on the 'SALE_DATE' column. However, the query predicate uses a range filter on a different column, 'REGION', which is not part of the clustering key. Which of the following strategies would likely improve the query performance significantly? (Select TWO)

A. Create a search optimization service on the 'REGION' column of the 'SALES_DATA' table.

B. Recluster the table using both 'SALE_DATE' and 'REGION' as clustering keys to improve data locality for the filter.

C. Create a new clustering key using 'REGION' and remove the 'SALE_DATE' clustering key, dropping the existing clustering key before creating the new one.

D. Create a materialized view that includes 'REGION' and the other necessary columns, and include only the last year of data to reduce the size of the view.

E. Increase the virtual warehouse size used for querying the 'SALES_DATA' table.

Answer: A, D

Question 2

Consider the following Snowflake table schema and data:

'CREATE TABLE products (product_id INTEGER, product_name VARCHAR, properties VARIANT);'

Data:

'INSERT INTO products VALUES (1, 'Laptop', PARSE_JSON('{[...] "silver", "storage": "512GB", "price": 1200.00}'));'
'INSERT INTO products VALUES (2, 'Mouse', PARSE_JSON('{[...] "wireless", "dpi": 1600, "price": 25.00}'));'
'INSERT INTO products VALUES (3, 'Keyboard', PARSE_JSON('{"layout": "US", "backlit": true, "price": [...]

Which of the following SQL queries will return the 'product_name' and 'price' for all products where the 'price' is greater than 50, ensuring that the 'price' is treated as a numeric value for comparison? (Select all that apply)

A.

B.

C.

D.

E.

Answer: D, E
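
The comparison the question asks for can be sketched in plain Python rather than Snowflake SQL: the value stored under "price" is cast to a number before it is compared with 50. The rows reuse the Laptop and Mouse prices from the question; the Keyboard row is left out because its price is truncated above, the JSON documents are trimmed to the keys that survive in the source, and the helper name `products_over` is hypothetical. In Snowflake the analogous step would be an explicit cast on the VARIANT path, for example `properties:price::NUMBER(10,2)`.

```python
import json

# Rows mirror the question's table. Only keys that survive in the source are
# kept; the Keyboard row is omitted because its price is truncated there.
rows = [
    (1, "Laptop", '{"storage": "512GB", "price": 1200.00}'),
    (2, "Mouse", '{"dpi": 1600, "price": 25.00}'),
]


def products_over(rows, threshold):
    """Return (product_name, price) pairs whose numeric price exceeds threshold."""
    result = []
    for _product_id, name, raw_properties in rows:
        properties = json.loads(raw_properties)
        price = float(properties["price"])  # explicit numeric cast before comparing
        if price > threshold:
            result.append((name, price))
    return result


expensive = products_over(rows, 50)
print(expensive)
```

Without the cast, a price stored as a JSON string would be compared as text, which is the failure mode the question is probing.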

Question 3

You've identified a 'Filter' operation in a Snowflake query execution plan that is consuming a significant amount of time. The filter predicate involves a UDF (User-Defined Function) called 'calculate_score(column1, column2)'. The UDF is written in Python. Analyzing the plan, you observe a high number of rows being processed by this filter. How can you optimize this scenario for faster query execution?

A. Implement caching within the UDF to store previously calculated scores and reuse them for identical inputs.

B. Increase the warehouse size to provide more resources for UDF execution.

C. Replace the UDF with a regular expression that mimics the calculation to increase performance.

D. Rewrite the UDF in SQL instead of Python to leverage Snowflake's native execution engine.

E. Create a materialized view that pre-calculates the score using the 'calculate_score' UDF and stores the results. The query should then filter on the materialized view.

Answer: D, E

Question 4

You are tasked with building a dashboard that visualizes website traffic data stored in Snowflake. The data includes daily unique visitors, bounce rate, and average session duration. The business stakeholders want to understand the correlation between these metrics. They also want to identify any outliers or anomalies. Which chart type is BEST suited for identifying correlation and outliers in this dataset?

A. A line chart showing each metric over time.

B. A bar chart comparing the average values of each metric.

C. A scatter plot matrix showing the pairwise relationships between all metrics.

D. A histogram showing the distribution of individual metrics.

E. A pie chart showing the percentage contribution of each metric to the total.

Answer: C

Question 5

You have a table named 'SALES_DATA' containing daily sales records. You need to identify and handle outliers in the 'SALES_AMOUNT' column. Specifically, you want to replace any 'SALES_AMOUNT' values that fall outside of three standard deviations from the mean with the median 'SALES_AMOUNT'. What is the most efficient way to achieve this data transformation in Snowflake?

A. Export the data to an external system, perform the outlier detection and replacement there, and then load the cleaned data back into Snowflake.

B. Use a single 'CREATE OR REPLACE TABLE ... AS SELECT' (CTAS) statement with window functions and a 'CASE' expression to calculate the mean, standard deviation, and median, and replace outliers in the same query.

C. Create a stored procedure to iterate through each row and check for outliers, updating 'SALES_AMOUNT' accordingly.

D. Clone the table, calculate the mean, standard deviation, and median on the original table, use them in a CASE expression to create a new sales amount column that identifies outliers, and insert the result into the cloned table.

E. Calculate mean and standard deviation using separate queries, then use an UPDATE statement with a WHERE clause to identify and replace outliers.

Answer: B
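
The single-pass rule in the question (compute the mean, standard deviation, and median once, then apply a CASE-style replacement to every row) can be sketched in plain Python. This is an illustration of the logic only, not Snowflake SQL; the function name `replace_outliers` and the sample figures are hypothetical, and the sketch assumes the sample standard deviation is the one intended.

```python
import statistics


def replace_outliers(values, k=3.0):
    """Replace values more than k sample standard deviations from the mean with the median."""
    mean = statistics.mean(values)
    stddev = statistics.stdev(values)  # sample standard deviation (assumption)
    median = statistics.median(values)
    lower, upper = mean - k * stddev, mean + k * stddev
    # Same shape as: CASE WHEN amount BETWEEN lower AND upper THEN amount ELSE median END
    return [v if lower <= v <= upper else median for v in values]


# Hypothetical daily sales: twenty ordinary days and one extreme value.
sales = [100.0, 102.0, 98.0, 101.0, 99.0] * 4 + [5000.0]
cleaned = replace_outliers(sales)
print(cleaned[-1])
```

The aggregates are computed over the original values, including the outlier itself, which matches computing them with window functions over the whole table in one statement.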

Question 6

You are tasked with creating a data access strategy for a marketing analytics team. They need access to customer purchase data, but only aggregated by region and product category. They should not be able to see individual customer details due to PII compliance. You decide to use a Secure View. Which of the following are the MOST appropriate steps to ensure data security and minimize performance impact?

A. Create a Materialized View directly on the base tables with the aggregation logic. Grant SELECT privilege on the Materialized View to the marketing analytics role.

B. Create a Secure View directly on the base tables with the aggregation logic. Grant SELECT privilege on the view to the marketing analytics role.

C. Create a Materialized View that aggregates the data. Create a Secure View on top of the Materialized View and grant SELECT privilege on the secure view to the marketing analytics role.

D. Create a regular view that aggregates the data and grant SELECT privilege on the view to the marketing analytics role.

E. Create a Secure View that aggregates the data and grant SELECT privilege on the view to the marketing analytics role.

Answer: E

Question 7

You have a large dataset of IoT sensor readings stored in compressed JSON files within an AWS S3 bucket. Each JSON file contains an array of sensor readings with the following structure:
You need to load this data into a Snowflake table named 'sensor_data' with columns 'sensor_id', 'timestamp', 'temperature', and 'humidity'. Which of the following Snowflake commands would be the MOST efficient and appropriate to ingest this data, assuming you have already created the table and a named stage pointing to the S3 bucket?

A. Option E

B. Option D

C. Option C

D. Option B

E. Option A

Answer: E

Question 8

You are designing a data pipeline in Snowflake that ingests data from multiple external sources with varying schemas and data quality. After ingestion, you need to standardize the data format, handle missing values, and perform data type conversions before loading it into your analytical tables. You need to implement a reusable and maintainable solution. Which approach minimizes code duplication and maximizes data quality?

A. Ingest the data into a single large table without any transformation, and rely on business intelligence tools to handle data cleaning and transformation during analysis.

B. Use Snowflake's pipes and Snowpipe to load raw data into staging tables, then use a combination of dynamic SQL, user-defined functions (UDFs), and stored procedures to perform the data cleaning and transformation in a modular and reusable manner.

C. Create separate SQL scripts for each data source to handle the specific data cleaning and transformation requirements.

D. Implement a centralized stored procedure that accepts the data source name as a parameter and performs all data cleaning and transformation logic based on conditional statements (CASE statements).

E. Use Snowflake's external tables to directly query the data in its raw format and perform the data cleaning and transformation on-the-fly during query execution.

Answer: B

Question 9

You are analyzing customer churn for a subscription-based service. You have a table 'SUBSCRIPTIONS' with columns 'CUSTOMER_ID', 'START_DATE', 'END_DATE', 'SUBSCRIPTION_TYPE', and 'REVENUE'. You want to classify customers who are likely to churn based on their past subscription behavior. Which Snowflake SQL code snippet is MOST efficient for calculating the number of months each customer was subscribed and identifying those who subscribed for less than 3 months as potential churn candidates?

A.

B.

C.

D.

E.

Answer: E
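
The answer options are not preserved above, so the following is only a plain-Python sketch of the logic the question describes, not a reconstruction of any option. It assumes months are counted as calendar-month boundaries crossed between 'START_DATE' and 'END_DATE', which is how Snowflake's DATEDIFF behaves with a month date part. The sample rows and the names `months_between` and `churn_candidates` are hypothetical.

```python
from datetime import date

# Hypothetical (CUSTOMER_ID, START_DATE, END_DATE) rows.
subscriptions = [
    ("C1", date(2023, 1, 15), date(2023, 3, 10)),
    ("C2", date(2023, 1, 1), date(2023, 12, 31)),
    ("C3", date(2022, 11, 20), date(2023, 2, 5)),
]


def months_between(start, end):
    """Count calendar-month boundaries crossed between two dates."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def churn_candidates(rows, min_months=3):
    """Return (customer_id, months) for subscriptions shorter than min_months."""
    result = []
    for customer_id, start, end in rows:
        months = months_between(start, end)
        if months < min_months:
            result.append((customer_id, months))
    return result


candidates = churn_candidates(subscriptions)
print(candidates)
```

Because only month boundaries are counted, a subscription from January 31 to February 1 counts as one month; a stricter definition would need day-level arithmetic.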

Question 10

You are using a Snowflake Marketplace data feed that provides daily stock prices. The data is updated daily, and you need to create a process to automatically load the new data into your existing 'STOCK_PRICES' table. The Marketplace data feed provides a view called 'MARKETPLACE_STOCK_PRICES' with columns 'DATE' (DATE), 'SYMBOL' (VARCHAR), and 'PRICE' (NUMBER). Your 'STOCK_PRICES' table has the same columns. Which of the following Snowflake features or techniques would be BEST suited for automatically loading the new data each day, ensuring that duplicate entries for the same 'DATE' and 'SYMBOL' are avoided?

A. Use Snowpipe Auto-Ingest with an external stage pointing to the Snowflake Marketplace data feed. Configure the pipe to copy new files from the Marketplace's internal stage to your table.

B. Create a Snowflake Task that runs daily and executes a 'MERGE' statement to insert new data from 'MARKETPLACE_STOCK_PRICES' into 'STOCK_PRICES', updating existing records if a match exists based on 'DATE' and 'SYMBOL' and inserting new records if there is no match. Also add a condition to remove duplicate entries for the same 'DATE' and 'SYMBOL'.

C. Create a Snowflake Task that runs daily and executes a simple 'INSERT INTO STOCK_PRICES SELECT * FROM MARKETPLACE_STOCK_PRICES' statement.

D. Create a Stream on the 'MARKETPLACE_STOCK_PRICES' view, and then create a Snowflake Task that runs daily and processes the stream, inserting only new or changed data into the table. Schedule the Stream refresh frequency to once every day to match the new data feed frequency.

E. Create a Snowflake Pipe that automatically ingests data from the view into the 'STOCK_PRICES' table whenever new data is available.

Answer: B
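
The upsert behavior that a MERGE keyed on 'DATE' and 'SYMBOL' provides can be illustrated with a small plain-Python sketch. It is not Snowflake SQL; the function name `merge_prices`, the ticker, and the prices are hypothetical, and the sketch assumes that when the source repeats a key, the last row wins.

```python
def merge_prices(target, source_rows):
    """Upsert source rows into a copy of target, keyed on (date, symbol).

    target: dict mapping (date, symbol) -> price
    source_rows: iterable of (date, symbol, price) tuples, possibly with duplicates
    """
    merged = dict(target)  # leave the caller's table untouched
    for day, symbol, price in source_rows:
        # Matched key: update the price. Unmatched key: insert a new entry.
        merged[(day, symbol)] = price
    return merged


# Hypothetical existing table and daily feed (the feed repeats one row).
stock_prices = {("2024-01-02", "ABC"): 10.0}
feed = [
    ("2024-01-02", "ABC", 10.5),
    ("2024-01-03", "ABC", 11.0),
    ("2024-01-03", "ABC", 11.0),
]

result = merge_prices(stock_prices, feed)
print(result)
```

Keying on the pair is what keeps reruns idempotent: applying the same feed twice leaves the table unchanged, whereas a plain insert would double the rows.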

Question 11

A data analyst needs to process a large JSON payload stored in a VARIANT column named 'payload' in a table called 'raw_events'. The payload contains an array of user sessions, each with potentially different attributes. Each session object in the array has a 'sessionId', a 'userId', and an array of 'events'. The events array contains objects with 'eventType' and 'timestamp'. The analyst wants to use a table function to flatten this nested structure into a relational format for easier analysis. Which approach is most efficient and correct for extracting and transforming this data?

A. Load the JSON data into a temporary table, then write a series of complex SQL queries with JOINs and UNNEST operations to flatten the data.

B. Employ a combination of LATERAL FLATTEN and Snowpark DataFrames, using LATERAL FLATTEN to partially flatten the JSON and then Snowpark to handle the remaining complex transformations and data type handling.

C. Utilize a Snowpark DataFrame transformation with multiple 'explode' operations and schema inference to flatten the nested structure and load data into a new table.

D. Create a recursive UDF (User-Defined Function) in Python to traverse the nested JSON and return a structured result, then call this UDF in a SELECT statement.

E. Use LATERAL FLATTEN with multiple levels of nesting, specifying 'path' for each level and directly selecting the desired attributes.

Answer: E
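
Two levels of flattening, one per nested array, can be pictured as two nested loops. The plain-Python sketch below shows the row shape such a query would produce; it is not Snowflake SQL. The top-level key `sessions`, the sample values, and the function name `flatten_sessions` are assumptions, since the question does not name the array that holds the sessions.

```python
# Hypothetical payload following the attributes named in the question.
payload = {
    "sessions": [
        {
            "sessionId": "s1",
            "userId": "u1",
            "events": [
                {"eventType": "click", "timestamp": "2024-01-01T10:00:00"},
                {"eventType": "view", "timestamp": "2024-01-01T10:01:00"},
            ],
        },
        {"sessionId": "s2", "userId": "u2", "events": []},
    ]
}


def flatten_sessions(payload):
    """Return one (sessionId, userId, eventType, timestamp) row per event."""
    rows = []
    for session in payload.get("sessions", []):  # first flatten level: sessions
        for event in session.get("events", []):  # second flatten level: events
            rows.append((
                session.get("sessionId"),
                session.get("userId"),
                event.get("eventType"),  # missing attributes become None
                event.get("timestamp"),
            ))
    return rows


rows = flatten_sessions(payload)
print(rows)
```

A session with no events contributes no rows here, which mirrors a non-outer flatten; keeping such sessions would need an outer variant that emits a row with empty event fields.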

Question 12

You have developed a Snowsight dashboard for your marketing team that contains sensitive customer data. You need to share this dashboard with a specific group of users, but ensure that they can only view the data and cannot modify the dashboard itself or the underlying queries. Which of the following steps should you take to securely share the dashboard?

A. Share the dashboard with the group using the 'Can view' permission. Ensure the group has the 'SELECT' privilege on the tables/views used in the queries and the 'USAGE' privilege on the database and schema. Also ensure that any intermediate tables created by the dashboard are granted these privileges.

B. Share the dashboard with the group using the 'Can view' permission. Then, grant the users in the group the 'MONITOR' privilege on the virtual warehouse used by Snowsight.

C. Create a scheduled task to export the dashboard as a PDF and email it to the group on a daily basis.

D. Share the dashboard with the group using the 'Can edit' permission. Then, grant the users in the group the 'USAGE' privilege on the database and schema containing the data.

E. Share the dashboard with 'Can view' permission and no further action is required.

Answer: A
