Sunday, October 31, 2010

SSIS related Interview Questions with answers

Here are some SSIS-related interview questions with answers. Hope they help.

1) What is the control flow?
2) What is a data flow?
3) How do you do error handling in SSIS?
4) How do you do logging in SSIS?
5) How do you deploy SSIS packages?
6) How do you schedule SSIS packages to run on the fly?
7) How do you run a stored procedure and get data?
8) A scenario: You want to load a text file into a database table, but during the upload you want to change a column called months (January, Feb, etc.) to a code (1, 2, 3, ...). This code can be read from another database table called months. After the data is converted, upload the file. If there are any errors, write them to an error table. Then, for all errors, read the errors from the database, create a file, and mail it to the supervisor.
How would you accomplish this task in SSIS?
9) What are variables and what is variable scope?
Answers
For Q 1 and 2:
In SSIS a workflow is called a control-flow. A control-flow links together our modular data-flows as a series of operations in order to achieve a desired result.

A control flow consists of one or more tasks and containers that execute when the package runs. To control order or define the conditions for running the next task or container in the package control flow, you use precedence constraints to connect the tasks and containers in a package. A subset of tasks and containers can also be grouped and run repeatedly as a unit within the package control flow.

SQL Server 2005 Integration Services (SSIS) provides three different types of control flow elements: containers that provide structures in packages, tasks that provide functionality, and precedence constraints that connect the executables, containers, and tasks into an ordered control flow.
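To make the idea of precedence constraints concrete, here is a minimal Python sketch. It is not SSIS code: the `Task` class, `run_control_flow`, and the task names are all made up for illustration, and it assumes tasks are listed in a valid execution order and that multiple constraints on one task are combined with a logical AND.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Task:
    name: str
    action: Callable[[], bool]  # returns True on success, False on failure


# (upstream task, downstream task, condition), where condition is
# "success", "failure", or "completion"
Constraint = Tuple[str, str, str]


def run_control_flow(tasks: List[Task], constraints: List[Constraint]) -> List[str]:
    """Run each task whose incoming precedence constraints are all satisfied."""
    results: Dict[str, bool] = {}
    executed: List[str] = []
    for task in tasks:
        incoming = [(up, cond) for up, down, cond in constraints if down == task.name]
        ready = all(
            up in results
            and (cond == "completion" or results[up] == (cond == "success"))
            for up, cond in incoming
        )
        if ready:
            results[task.name] = task.action()
            executed.append(task.name)
    return executed
```

With a failing "load" task, a task constrained on its success is skipped while a task constrained on its failure still runs.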

A data flow consists of the sources and destinations that extract and load data, the transformations that modify and extend data, and the paths that link sources, transformations, and destinations. Before you can add a data flow to a package, the package control flow must include a Data Flow task. The Data Flow task is the executable within the SSIS package that creates, orders, and runs the data flow. A separate instance of the data flow engine is opened for each Data Flow task in a package.

SQL Server 2005 Integration Services (SSIS) provides three different types of data flow components: sources, transformations, and destinations. Sources extract data from data stores such as tables and views in relational databases, files, and Analysis Services databases. Transformations modify, summarize, and clean data. Destinations load data into data stores or create in-memory datasets.
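The source, transformation, destination chain can be pictured with a small Python sketch. This is only an analogy for how rows stream through a data flow, not how the SSIS engine is implemented; the function names and the two-column row layout are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, str]


def source(lines: Iterable[str]) -> Iterator[Row]:
    """Source: extract rows from comma-separated text lines."""
    for line in lines:
        name, amount = line.strip().split(",")
        yield {"name": name, "amount": amount}


def transform(rows: Iterable[Row]) -> Iterator[Row]:
    """Transformation: clean the name column of each row."""
    for row in rows:
        yield {**row, "name": row["name"].strip().title()}


def destination(rows: Iterable[Row], table: List[Row]) -> int:
    """Destination: load rows into an in-memory 'table' and return the row count."""
    count = 0
    for row in rows:
        table.append(row)
        count += 1
    return count
```

Chaining the three, for example `destination(transform(source(lines)), table)`, plays the role of the paths that link the components.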
Q3:
When a data flow component applies a transformation to column data, extracts data from sources, or loads data into destinations, errors can occur. Errors frequently occur because of unexpected data values.

For example, a data conversion fails because a column contains a string instead of a number, an insertion into a database column fails because the data is a date and the column has a numeric data type, or an expression fails to evaluate because a column value is zero, resulting in a mathematical operation that is not valid.

Errors typically fall into one of the following categories:

-Data conversion errors, which occur if a conversion results in loss of significant digits, the loss of insignificant digits, and the truncation of strings. Data conversion errors also occur if the requested conversion is not supported.
-Expression evaluation errors, which occur if expressions that are evaluated at run time perform invalid operations or become syntactically incorrect because of missing or incorrect data values.
-Lookup errors, which occur if a lookup operation fails to locate a match in the lookup table.

Many data flow components support error outputs, which let you control how the component handles row-level errors in both incoming and outgoing data. You specify how the component behaves when truncation or an error occurs by setting options on individual columns in the input or output.

For example, you can specify that the component should fail if customer name data is truncated, but ignore errors on another column that contains less important data.
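A rough Python sketch of the three row-level choices (fail the component, ignore the failure, or redirect the row to an error output) might look like this. The function name, the `on_error` option values, and the row layout are hypothetical, and treating an ignored conversion failure as a null value is an assumption of this sketch.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def convert_amounts(rows: List[Row], on_error: str = "redirect") -> Tuple[List[Row], List[Row]]:
    """Convert the 'amount' column to int, handling bad rows per on_error."""
    if on_error not in ("fail", "ignore", "redirect"):
        raise ValueError(f"unknown on_error option: {on_error}")
    good: List[Row] = []
    errors: List[Row] = []
    for row in rows:
        try:
            good.append({**row, "amount": int(str(row["amount"]))})
        except ValueError as exc:
            if on_error == "fail":
                raise  # fail the whole component
            if on_error == "redirect":
                # send the row to the error output with a description
                errors.append({**row, "error": str(exc)})
            else:
                # ignore: keep the row, with a null in the failed column
                good.append({**row, "amount": None})
    return good, errors
```

The rows collected in `errors` are what you would then write to an error table.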

Q 4:
SSIS includes logging features that write log entries when run-time events occur and can also write custom messages.

Integration Services supports a diverse set of log providers, and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files.

Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log. The tasks and containers in a package can be enabled for logging even if the package itself is not.

To customize the logging of an event or custom message, Integration Services provides a schema of commonly logged information to include in log entries. The Integration Services log schema defines the information that you can log. You can select elements from the log schema for each log entry.

To enable logging in a package
1. In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
2. On the SSIS menu, click Logging.
3. Select a log provider in the Provider type list, and then click Add.
Q 5 :

SQL Server 2005 Integration Services (SSIS) makes it simple to deploy packages to any computer.
There are two steps in the package deployment process:
-The first step is to build the Integration Services project to create a package deployment utility.
-The second step is to copy the deployment folder that was created when you built the Integration Services project to the target computer, and then run the Package Installation Wizard to install the packages.
Q 9 :

Variables store values that an SSIS package and its containers, tasks, and event handlers can use at run time. The scripts in the Script task and the Script component can also use variables. The precedence constraints that sequence tasks and containers into a workflow can use variables when their constraint definitions include expressions.

Integration Services supports two types of variables: user-defined variables and system variables. User-defined variables are defined by package developers, and system variables are defined by Integration Services. You can create as many user-defined variables as a package requires, but you cannot create additional system variables.

Scope :

A variable is created within the scope of a package or within the scope of a container, task, or event handler in the package. Because the package container is at the top of the container hierarchy, variables with package scope function like global variables and can be used by all containers in the package. Similarly, variables defined within the scope of a container such as a For Loop container can be used by all tasks or containers within the For Loop container.
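Scope resolution through the container hierarchy can be sketched in a few lines of Python. The `Container` class and the variable names below are invented for illustration; this mimics the visibility rule described above, not the actual SSIS object model.

```python
from typing import Dict, Optional


class Container:
    """A package, container, or task that can own variables."""

    def __init__(self, name: str, parent: Optional["Container"] = None) -> None:
        self.name = name
        self.parent = parent
        self.variables: Dict[str, object] = {}

    def resolve(self, var_name: str) -> object:
        """Look for the variable here, then walk up toward the package."""
        scope: Optional[Container] = self
        while scope is not None:
            if var_name in scope.variables:
                return scope.variables[var_name]
            scope = scope.parent
        raise KeyError(f"{var_name} is not visible from {self.name}")
```

A task inside a For Loop container sees both the loop's variables and the package's, while a sibling of the loop sees only the package-scoped ones.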


Question 1 - True or False - Using a checkpoint file in SSIS is just like issuing the CHECKPOINT command against the relational engine. It commits all of the data to the database.
False. SSIS provides a Checkpoint capability which allows a package to restart at the point of failure.

Question 2 - Can you explain what the Import\Export tool does and the basic steps in the wizard?
The Import\Export tool is accessible via BIDS or by executing the dtswizard command.
The tool identifies a data source and a destination to move data within one database, between instances, or even from a database to a file (or vice versa).


Question 3 - What are the command line tools to execute SQL Server Integration Services packages?
DTEXECUI - When this command line tool is run, a user interface is loaded in order to configure each of the applicable parameters to execute an SSIS package.
DTEXEC - This is a pure command line tool where all of the needed switches must be passed into the command for successful execution of the SSIS package.
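As a small illustration, a script that launches DTEXEC could assemble its arguments like this. The helper name, the package path, and the variable name in the usage note are hypothetical; only the `/F` (file) and `/SET` (property path and value separated by a semicolon) switches are used.

```python
from typing import Dict, List, Optional


def build_dtexec_command(
    package_path: str, set_values: Optional[Dict[str, str]] = None
) -> List[str]:
    """Build the argument list for running a .dtsx file with dtexec."""
    command = ["dtexec", "/F", package_path]
    for property_path, value in (set_values or {}).items():
        # /SET takes "propertyPath;value" as a single argument
        command += ["/SET", f"{property_path};{value}"]
    return command
```

On a machine where dtexec is installed, the resulting list could be passed to `subprocess.run`, for example `build_dtexec_command(r"C:\packages\Load.dtsx", {r"\Package.Variables[User::RunDate].Value": "2010-10-31"})`.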


Question 4 - Can you explain the SQL Server Integration Services functionality in Management Studio?
You have the ability to do the following:
Log in to the SQL Server Integration Services instance
View the SSIS log
View the packages that are currently running on that instance
Browse the packages stored in MSDB or the file system
Import or export packages
Delete packages
Run packages

Question 5 - Can you name some of the core SSIS components in the Business Intelligence Development Studio you work with on a regular basis when building an SSIS package?
Connection Managers
Control Flow
Data Flow
Event Handlers
Variables window
Toolbox window
Output window
Logging
Package Configurations

Question Difficulty = Moderate

Question 1 - True or False: SSIS has a default means to log all records updated, deleted or inserted on a per table basis.
False, but a custom solution can be built to meet these needs.

Question 2 - What is a breakpoint in SSIS? How is it set up? How do you disable it?
A breakpoint is a stopping point in the code. The breakpoint gives the developer or DBA an opportunity to review the status of the data, the variables, and the overall status of the SSIS package.
10 unique break conditions exist for each breakpoint.
Breakpoints are set up in BIDS. In BIDS, navigate to the control flow interface. Right-click the object where you want to set the breakpoint and select the 'Edit Breakpoints...' option.


Question 3 - Can you name 5 or more of the native SSIS connection managers?
OLEDB connection - Used to connect to any data source requiring an OLEDB connection (e.g., SQL Server 2000)
Flat file connection - Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
ADO.Net connection - Uses the .Net Provider to make a connection to SQL Server 2005 or other connection exposed through managed code (like C#) in a custom task
Analysis Services connection - Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
File connection - Used to reference a file or folder. The options are to either use or create a file or folder
Excel
FTP
HTTP
MSMQ
SMO
SMTP
SQLMobile
WMI


Question 4 - How do you eliminate quotes from being uploaded from a flat file to SQL Server?
In the SSIS package, on the Flat File Connection Manager Editor, enter a double quote (") in the Text qualifier field, then preview the data to ensure the quotes are not included.
Additional information: How to strip out double quotes from an import file in SQL Server Integration Services
Question 5 - Can you name 5 or more of the main SSIS tool box widgets and their functionality?
For Loop Container
Foreach Loop Container
Sequence Container
ActiveX Script Task
Analysis Services Execute DDL Task
Analysis Services Processing Task
Bulk Insert Task
Data Flow Task
Data Mining Query Task
Execute DTS 2000 Package Task
Execute Package Task
Execute Process Task
Execute SQL Task
etc.

Question Difficulty = Difficult

Question 1 - Can you explain one approach to deploy an SSIS package?
One option is to build a deployment manifest file in BIDS, copy the directory to the applicable SQL Server, and then work through the steps of the Package Installation Wizard.
A second option is to use the dtutil utility to copy, move, delete, or verify the existence of an SSIS package.
A third option is to log in to SQL Server Integration Services via SQL Server Management Studio, navigate to the 'Stored Packages' folder, and then right-click one of the child folders or an SSIS package to access the 'Import Packages...' or 'Export Packages...' option.
A fourth option in BIDS is to navigate to File | Save Copy of Package and complete the interface.



Question 2 - Can you explain how to setup a checkpoint file in SSIS?
The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName - Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path, it's a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage - Determines if/how checkpoints are used. Choose from these options: Never (default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints - Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.
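The way these three properties interact can be sketched in Python. This is a simplified model, not the SSIS implementation: the function name is made up, the checkpoint is stored as a JSON list of completed task names (the real file is different and also holds variable values), and the sketch assumes the file is deleted after a successful run.

```python
import json
import os
from typing import Callable, List, Tuple


def run_with_checkpoint(
    tasks: List[Tuple[str, Callable[[], None]]],
    checkpoint_file: str,
    usage: str = "IfExists",
    save_checkpoints: bool = True,
) -> List[str]:
    """Run tasks in order, skipping those recorded in the checkpoint file."""
    completed: List[str] = []
    if usage != "Never":
        if os.path.exists(checkpoint_file):
            with open(checkpoint_file) as fh:
                completed = json.load(fh)
        elif usage == "Always":
            raise FileNotFoundError(checkpoint_file)
    executed: List[str] = []
    for name, action in tasks:
        if name in completed:
            continue  # finished in an earlier run
        action()  # an exception here leaves the file in place for a restart
        executed.append(name)
        completed.append(name)
        if save_checkpoints and usage != "Never":
            with open(checkpoint_file, "w") as fh:
                json.dump(completed, fh)
    if usage != "Never" and os.path.exists(checkpoint_file):
        os.remove(checkpoint_file)  # successful run: clean up
    return executed
```

Running the same task list a second time after a failure picks up at the task that failed instead of starting over.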

Question 3 - Can you explain different options for dynamic configurations in SSIS?
Use an XML file
Use custom variables
Use a database per environment with the variables
Use a centralized database with all variables

Question 4 - How do you upgrade an SSIS Package?
Depending on the complexity of the package, one of two techniques is typically used:
Recode the package based on the functionality in SQL Server DTS
Use the Migrate DTS 2000 Package wizard in BIDS then recode any portion of the package that is not accurate


Question 5 - Can you name five of the Perfmon counters for SSIS and the value they provide?
SQLServer:SSIS Service
SSIS Package Instances - Total number of simultaneous SSIS Packages running
SQLServer:SSIS Pipeline
BLOB bytes read - Total bytes read from binary large objects during the monitoring period.
BLOB bytes written - Total bytes written to binary large objects during the monitoring period.
BLOB files in use - Number of binary large objects files used during the data flow task during the monitoring period.
Buffer memory - The amount of physical or virtual memory used by the data flow task during the monitoring period.
Buffers in use - The number of buffers in use during the data flow task during the monitoring period.
Buffers spooled - The number of buffers written to disk during the data flow task during the monitoring period.
Flat buffer memory - The total amount of memory, in bytes, used by all flat buffers. Flat buffers are blocks of memory that a component uses to store data.
Flat buffers in use - The number of blocks of memory in use by the data flow task at a point in time.
Private buffer memory - The total amount of physical or virtual memory used by data transformation tasks in the data flow engine during the monitoring period.
Private buffers in use - The number of blocks of memory in use by the transformations in the data flow task at a point in time.
Rows read - The number of rows that a source produces.
Rows written - The number of rows offered to a destination.

*************************************************************************
SSIS Interview Questions
1. What does a control flow do?
2. Generically explain what happens inside a data flow task.
3. Explain what ETL is.
4. Which task would you use to copy, move or delete files?
5. Which transform would you use to split your data based on conditions you define?
6. Explain the pros and cons of deploying to the file system vs msdb.
7. If you did not know the answer to a question, what would be your next step to find the answer?

8. What is a breakpoint in SSIS? How is it set up? How do you disable it?
9. Can you roll back a transaction in SSIS? Explain how, step by step.
10. What is the file extension of an SSIS package?

1. About your previous project
2. What kind of activities did you do as a DBA?
3. How do you do error handling in SSIS?
4. How do you troubleshoot in case of a change request in SSIS?
5. How do you decide between a star schema and a snowflake schema?
6. Give a scenario in which you designed a warehouse database for an existing OLTP database.
7. What are slowly changing dimensions? Explain a scenario where you implemented them.
8. Give a scenario where you wrote a complex MDX query.
9. Give a scenario where you were involved in discussions with a customer as a BA.
10. How will you handle an SSIS ETL requirement, and how will you assign the tasks to your team members?
11. What is data staging and validation?


Video ********************
http://perseus.franklins.net/dnrtv_0026.wmv
http://perseus.franklins.net/dnrtv_0027.wmv
http://perseus.franklins.net/dnrtv_0028.wmv

************************************
1. What is the difference between Merge and Merge Join?
2. Which native transformation does not have error flow redirection?
3. How can one handle transactions in SSIS?
4. What are the prerequisites for handling transactions in SSIS?
5. What is the difference between the Execute SQL task and the OLE DB Command?
6. What is the relevance of ? when configuring the OLE DB Command transformation?
7. What is the use of Multicast?

***************************************************************
Q-How to Generate an Auto Incremental Number in a SSIS Package?
Auto-incremental numbers in an SSIS package can be generated with a Script component. Drag and drop the Script component onto the data flow and select Transformation as the component type.
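The logic such a script carries is just a counter that is stamped onto each row as it passes through. Here is a Python sketch of that idea (in a real Script component you would write it in VB.NET or C#); the function name and the `RowNumber` column name are assumptions for the example.

```python
from typing import Dict, Iterable, Iterator


def add_row_numbers(
    rows: Iterable[Dict[str, object]], start: int = 1, column: str = "RowNumber"
) -> Iterator[Dict[str, object]]:
    """Attach an incrementing number to every row that flows through."""
    counter = start
    for row in rows:
        yield {**row, column: counter}
        counter += 1
```

Passing a `start` value lets the numbering continue from the current maximum key already present in the destination table.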
 

Q-Lookups are a key component in SQL Server Integration Services (SSIS). Explain their purpose.
The Lookup transformation combines data from two sources by matching fields between them. The lookups performed by the transformation are case sensitive...

Q-How to unzip a File in SSIS?
The Execute Process Task in the control flow can be used to unzip a file. Drag and drop the Execute Process Task onto the control flow, then configure Executable to specify the path of the unzip application, Arguments to specify the zip file to extract, and WorkingDirectory to set the working directory.

SQL Interview Questions  
What is XPath?
What is validation? How do we do it?
What is the difference between GROUP BY and ORDER BY?
What is FILLFACTOR?
How do we return a value from a stored procedure?
In a stored procedure, how do we grant only read permission?
Where do we write SQL commands in DTS?
How do we take a file from the network for DTS?
How do we move a database from one server to another?
What are DBCC commands?
What is the difference between a logical database and a physical database?
What are the differences between clustered and nonclustered indexes?
What are inner joins and outer joins?
What are the naming conventions for indexes?
What are star schema and snowflake schema?
What is the difference between a dimension table and a fact table?
What are the differences between DTD and XML?
What is the XML file specification?
How do we download and upload data from an XML file?


1) What are OLAP and OLTP? Do you know the difference between them?
2) What are the components of SSRS?
3) What is Notification Services and where do you use it?
4) What is SSIS? What is the difference between SSIS and DTS?
5) What is SQL Profiler?
6) What is MSDTC?
7) What is the firewall in SQL Server? How do you change it?
8) What is a linked server?
9) What is SQL injection?
10) Do you know what lock escalation is?
11) What is clustering in SQL Server? Explain active/active (AA) and active/passive (AP) modes. What happens when the shared hard drive fails?
12) What are isolation levels?
13) How do you write extended stored procedures? What is the difference between extended and regular stored procedures?
14) What are orphaned sessions?
15) What is a temp table? How do you create one, and what are its advantages and disadvantages?
16) What is a transaction log? What do you do if the transaction log growth is abnormal?
17) Can DTS be used to send mail?
18) What is a DTS connector? Name some examples.
19) How do you connect to another Oracle server?
20) Is the data transfer between linked servers on demand or continuous?

What type of replication did you use and what are all the steps?
What is an index, how many types are there, and what is the difference between them?
How can you find a deadlock in a stored procedure?
What steps do we have to take for best performance?
What is DTS?
How do you take a transaction log backup, and how will you restore it if there is also a differential backup?
What is isolation?
When did you use triggers?
In how many ways can we transport a database?
Which DBCC commands do you generally use on a daily basis?
What are your responsibilities on a daily basis?
How many servers are in your company, and what is the maximum size of a database you have worked with? (25 servers)
Have you been involved in production support?
Have you worked on OLTP and OLAP, and what are they?
When do deadlocks occur?
How can you remove deadlocks in a stored procedure?

***************************************************
What stored procedures have you written?
What is the purpose of each procedure?
What is the difference between the types of indexes?
What are the disadvantages of indexes?
Why do we use identity columns?
If we delete a row from a table that has an identity column and want to insert a different row with the same identity value, what should we do?
How do we select the first 10 rows from a table?
What is a cursor and what are its disadvantages?
How do we check the performance of a stored procedure?
How do we get the first 3 characters of a string?
How do we remove blank spaces from a string?
How does SQL Server tell us to use an index on a particular column?
You can use the SQL Server Profiler Create Trace Wizard with "Identify Scans of Large Tables" trace to determine which tables in your database may need indexes. This trace will show which tables are being scanned by queries instead of using an index.


What are triggers? Tell me where you used them in your previous project.
What are GROUP BY and ORDER BY, and where did you use them in your stored procedures?


What is a global variable?
Difference between SCOPE_IDENTITY and @@IDENTITY

They both return a newly generated ID value, but the difference is:

@@IDENTITY returns the last IDENTITY value that was generated in your connection, regardless of the scope that generated it.
SCOPE_IDENTITY() returns the last IDENTITY value generated in the current scope. So what is the scope? A scope is a module of SQL code such as a stored procedure, a trigger, a user-defined function, or a batch.

How do you make sure that a query does not return a null value?


What is COALESCE in T-SQL?

The COALESCE Function: A more efficient approach to creating dynamic WHERE clauses involves using the COALESCE function. This function returns the first non-null expression in its expression list.
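Here is a small runnable sketch of the dynamic WHERE clause pattern. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (COALESCE behaves the same way for this purpose); the `orders` table and the `find_orders` helper are invented for the example. One caveat of the pattern: `col = COALESCE(?, col)` never matches rows where `col` is NULL, so the sketch declares the columns NOT NULL.

```python
import sqlite3
from typing import List, Optional


def find_orders(
    conn: sqlite3.Connection,
    customer: Optional[str] = None,
    status: Optional[str] = None,
) -> List[int]:
    """Filter only on the parameters that were actually supplied."""
    sql = """
        SELECT id FROM orders
        WHERE customer = COALESCE(?, customer)
          AND status   = COALESCE(?, status)
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, (customer, status))]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
    "customer TEXT NOT NULL, status TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "acme", "open"), (2, "acme", "closed"), (3, "globex", "open")],
)
```

When a parameter is NULL, COALESCE falls back to the column itself, so that condition is always true and the filter is effectively switched off.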

What is the difference between DELETE and TRUNCATE? Which one is better and why?
What is the difference between SELECT INTO and CREATE?
What is the difference between the WHERE clause and the HAVING clause?
The HAVING clause is used with GROUP BY and filters groups after aggregation. The WHERE clause is applied to each row before the rows are grouped.
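A quick way to see the difference is to run both filters in one query. The sketch below uses Python's built-in sqlite3 module in place of SQL Server, and the `sales` table and its data are made up for the example.

```python
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 5), ("east", 50), ("west", 20), ("west", 25), ("north", 200)],
)


def totals_for_large_regions(conn: sqlite3.Connection) -> List[Tuple[str, int]]:
    """WHERE drops small rows first; HAVING then drops small groups."""
    sql = """
        SELECT region, SUM(amount) AS total
        FROM sales
        WHERE amount >= 10          -- row filter, applied before grouping
        GROUP BY region
        HAVING SUM(amount) >= 50    -- group filter, applied after aggregation
        ORDER BY region
    """
    return conn.execute(sql).fetchall()
```

The east row with amount 5 is removed by WHERE before any sums are computed, and the west group (total 45) is removed by HAVING afterwards.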

What is a correlated query?
What is UPDATE STATISTICS?
This command is used after large-scale data processing. If we do a large number of deletions, modifications, or bulk copies into a table, the optimizer's statistics on the table's indexes and columns go out of date. UPDATE STATISTICS refreshes those statistics so that the query optimizer takes the changes into account.
What is left outer join?

**********************************************
How do you generate reports automatically?
I have 5 reports. How do I generate these 5 reports automatically at the end of the day?

   

2 comments:

  1. http://arun-sqlbooks.blogspot.com/2009/03/ssisssrs-interview-questionswith.html

  2. http://www.careerride.com/Interview-Questions-SQLServer-Reporting-services.aspx
