Latest DEA-C02 Free Dumps - Snowflake SnowPro Advanced: Data Engineer (DEA-C02)
You are designing a data warehouse for an e-commerce company. One of the requirements is to provide fast analytics on order fulfillment times by region. You have two tables. 'ORDERS': Contains order information, including ID, 'ORDER_DATE', 'REGION_ID', and 'FULFILLMENT_DATE'. 'REGIONS': Contains region information, including 'REGION_ID' and [...]. Due to the large size of the 'ORDERS' table and the complexity of calculating fulfillment times, you decide to use materialized views.
Which of the following combinations of materialized view definition and Snowflake features would BEST optimize query performance and minimize data staleness for this scenario? Choose two options.
Answer: C, E
Explanation: (visible to DumpTOP members only)
You have implemented external tokenization for a sensitive data column in Snowflake using a UDF that calls an external API. After some time, you discover that the external tokenization service is experiencing intermittent outages, causing queries using the tokenized column to fail. What is the BEST approach to mitigate this issue and maintain data availability while minimizing the risk of exposing the raw data?
Answer: C
Explanation: (visible to DumpTOP members only)
A data engineer is investigating high credit consumption on a Snowflake warehouse due to frequent re-clustering operations on a large table named 'WEB_EVENTS'. This table is clustered on 'EVENT_TIMESTAMP' and 'USER_ID'. The engineer suspects that the high frequency of data ingestion, especially out-of-order 'EVENT_TIMESTAMP' values, contributes to the poor clustering. Choose the options that can optimize clustering and reduce credit consumption, assuming you have limited control over the ingestion process and data quality.
Answer: A, B
Explanation: (visible to DumpTOP members only)
You're designing a Snowpark Scala stored procedure that must execute a series of complex data quality checks on a Snowflake table.
These checks involve multiple steps, including validating data types, checking for null values, and verifying data consistency against external reference data. You want to ensure that the stored procedure is resilient to errors, provides detailed logging, and can be easily monitored. Which of the following approaches would be the MOST robust and scalable for handling errors and logging within this Snowpark Scala stored procedure?
Answer: D
Explanation: (visible to DumpTOP members only)
You are designing a data loading process for a high-volume streaming data source. The data arrives as Avro files in an AWS S3 bucket. You need to load this data into a Snowflake table with minimal latency and operational overhead. Which of the following combinations of Snowflake features and configurations would be MOST suitable for this scenario? (Select TWO)
Answer: B, D
Explanation: (visible to DumpTOP members only)
You have configured a Kafka Connector to load JSON data into a Snowflake table named 'ORDERS'. The JSON data contains nested structures. However, Snowflake is only receiving the top-level fields, and the nested fields are being ignored. Which configuration option within the Kafka Connector needs to be adjusted to correctly flatten and load the nested JSON data into Snowflake?
Answer: B
Explanation: (visible to DumpTOP members only)
You have a Snowflake table 'orders_raw' with a VARIANT column named 'order_details' that contains an array of order items represented as JSON objects. Each object has 'item_id', 'quantity', and 'price'. You need to calculate the total revenue for each order. Which SQL statement efficiently flattens the array and calculates the total revenue using LATERAL FLATTEN and appropriate casting?


Answer: A
Explanation: (visible to DumpTOP members only)
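As a rough illustration of what the flatten-and-aggregate step in the question above computes, here is a plain-Python sketch. It is not Snowflake SQL and not one of the answer options. The 'order_id' key and the sample rows are assumptions made for the example, and keeping orders with no items at a total of 0 is a choice of this sketch rather than something the question specifies.

```python
from decimal import Decimal


def total_revenue_per_order(orders):
    """Return {order_id: total revenue} for rows shaped like 'orders_raw'."""
    totals = {}
    for order in orders:
        # Assumption of this sketch: a missing or empty array yields a total of 0.
        items = order.get("order_details") or []
        total = Decimal("0")
        for item in items:
            # Explicit casts, standing in for the casting the question asks about.
            quantity = int(item["quantity"])
            price = Decimal(str(item["price"]))
            total += quantity * price
        totals[order["order_id"]] = total
    return totals


# Hypothetical sample rows; 'order_id' is not named in the question.
sample_orders = [
    {"order_id": 1, "order_details": [
        {"item_id": "A", "quantity": 2, "price": "9.99"},
        {"item_id": "B", "quantity": 1, "price": "5.00"},
    ]},
    {"order_id": 2, "order_details": []},
]

print(total_revenue_per_order(sample_orders))
```

Each array element becomes one unit of work (one flattened row), and the per-order sum plays the role of the GROUP BY aggregate.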
Your company utilizes Snowflake Streams and Tasks for continuous data ingestion and transformation. A critical task, 'TRANSFORM_DATA', consumes data from a stream 'RAW_DATA_STREAM' on table 'RAW_DATA' and loads it into a reporting table 'REPORTING_TABLE'. You observe that 'TRANSFORM_DATA' is failing intermittently with a 'Stream is stale' error. What steps can you take to diagnose and resolve this issue? Choose all that apply.
Answer: B, C
Explanation: (visible to DumpTOP members only)
You are the provider of a data product on the Snowflake Marketplace. You need to grant trial access to a potential consumer. You want to provide limited access for 7 days to specific tables in your database. Which of the following steps are REQUIRED to accomplish this?
(Select all that apply)
Answer: A
Explanation: (visible to DumpTOP members only)
You have a Snowflake Task that is designed to transform and load data into a target table. The task relies on a Stream to detect changes in a source table. However, you notice that the task is intermittently failing with a 'Stream STALE' error, even though the data in the source table is continuously updated. What are the most likely root causes and the best combination of solutions to prevent this issue? (Select TWO)
Answer: A, E
Explanation: (visible to DumpTOP members only)
You are working with a Snowflake table 'customer_data' that contains customer information stored in a VARIANT column named 'raw_info'. The 'raw_info' JSON structure includes nested addresses and preferences. Your task is to extract the city from the first address in the 'addresses' array, and the customer's preferred communication method from the 'preferences' object. Some customers might not have addresses or preferences defined. Select the two SQL snippets that correctly and efficiently extract this data, handling missing fields gracefully and providing appropriate type casting. The address array is in the format 'addresses: [ { 'city': '...', 'state': ' '},


Answer: A, B
Explanation: (visible to DumpTOP members only)
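The extraction described in the question above can be sketched in plain Python to show the intended behavior for missing fields. This is only an illustration of the logic, not Snowflake SQL and not one of the answer options. The key name 'communication_method' is hypothetical, since the question does not name the key inside 'preferences'.

```python
def extract_city_and_preference(raw_info):
    """Return (city, preferred communication method), with None for missing parts."""
    raw_info = raw_info or {}

    # First address, if any; an absent or empty array falls back to an empty dict.
    addresses = raw_info.get("addresses") or []
    first_address = addresses[0] if addresses else {}
    city = first_address.get("city")

    # 'communication_method' is a hypothetical key name chosen for this sketch.
    preferences = raw_info.get("preferences") or {}
    method = preferences.get("communication_method")

    # Cast to string only when a value is present, so missing fields stay None.
    return (
        str(city) if city is not None else None,
        str(method) if method is not None else None,
    )


print(extract_city_and_preference({
    "addresses": [{"city": "Seoul", "state": "KR-11"}],
    "preferences": {"communication_method": "email"},
}))
print(extract_city_and_preference({"preferences": {}}))
```

The point of the sketch is that a missing array, an empty array, and a missing key all produce a null-like result instead of an error.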
A data engineer is using Snowpark Python to build a data pipeline. They need to define a UDF that uses a pre-trained machine learning model stored as a file in a Snowflake stage. The UDF should receive batches of data for scoring. Which of the following is the MOST efficient way to implement this, minimizing data transfer and execution time?
Answer: B, E
Explanation: (visible to DumpTOP members only)
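One general idea behind the scenario above is to load the model file once and then score rows in batches, rather than reloading it for every row. The sketch below shows that idea in plain Python using only the standard library. It does not use the Snowpark API and is not one of the answer options. The JSON "model", its file path, and the function names are all hypothetical stand-ins for a staged model file.

```python
import json
import os
import tempfile
from functools import lru_cache


@lru_cache(maxsize=1)
def load_model(path):
    """Read the model file once; later calls with the same path hit the cache."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def score_batch(path, rows):
    """Score a whole batch of feature rows with the cached model."""
    model = load_model(path)
    weights, bias = model["weights"], model["bias"]
    return [sum(w * x for w, x in zip(weights, row)) + bias for row in rows]


# Hypothetical stand-in for a staged model file: a tiny linear model stored as JSON.
model_path = os.path.join(tempfile.mkdtemp(), "model.json")
with open(model_path, "w", encoding="utf-8") as handle:
    json.dump({"weights": [1.0, 2.0], "bias": 0.5}, handle)

print(score_batch(model_path, [[1.0, 1.0], [2.0, 0.0]]))
print(score_batch(model_path, [[0.0, 0.0]]))
print(load_model.cache_info().misses)  # number of times the file was actually read
```

Calling `score_batch` twice reads the file only on the first call, which is the saving that caching is meant to provide when the same function is invoked for many batches.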