2013年12月15日 星期日

建立 SSIS 專案

要建立倉儲資料,就一定會用到資料轉移。 要轉移資料,使用匯出匯入精靈是最方便的,但是如果有下列情況,它就變的不太好用。 例如:轉移資料前必須做適當的資料修改或資料轉換,或者轉移的資料必須與目標中已存在的資料做合併。 而 SSIS 正是用來協助執行資料轉移的好工具。

Introducing Control Flow, Data Flow, and Connection Managers

Before you dive into SSIS development, you should be familiar with three essential elements of every SSIS package:

Connection managers

Connection managers provide connections to data stores and allow the connection to be defined once and used many times in the same package

Control flow

A package can consist of one or more operations, represented by control flow tasks. Execution order is defined by how individual tasks are connected to one another.

Data flow

Encapsulates the data movement components : ETL

  • One or more source components, designating the data stores from which the data will be extracted
  • One or more destination components, designating the data stores into which the data will be loaded
  • One or more (optional) transformation components, designating the transformations through which the data will be passed

簡單講,
Connection managers 定義 data sources與 data destinations 的連線設定,
Control flow 用來控管 SSIS process 的作業流程,
Data flow 則是真正執行 ETL 的核心部份。

Introducing SSIS Project Deployment

Typically, the principal difference between development and production environments is in the configuration of data stores. In development, all data can reside on the same server and could easily be placed on the developer's personal computer. Therefore, you should account for the following differences between the development and the production environments when developing SSIS solutions:

  • Connections
  • Data platforms
  • Security

In previous versions of SQL Server, it was possible to configure SSIS packages by using a configuration file or by storing the configuration data in a table. In SQL Server 2012, the configuration feature is effectively replaced with parameterization.

設定「控制流程」

You can add a new SSIS package or an existing SSIS package to the SSIS project.

Add an Existing SSIS Package to the SSIS Project

Edit the SSIS Package Created by the SQL Server Import

上面例子,我們加入一個由匯入匯出精靈自動建立的package,你可以雙擊這個封裝,就可以看見以下畫面。

底下我們將修改這個 package 的內容,並且進行簡單的 Debug。

1. 檢查及變更工作屬性:

你可以雙擊「準備SQL工作1」,就可以由「工作編輯器」看到以下內容:

「準備SQL工作1」是一個「執行SQL工作」,所以重點在於 Connection 和 SQLStatement 的設定。

你也可以在 Property 視窗中看到其他的設定,如下面屬性就是用來控制當該步驟發生錯誤時,是否要停止整個 package 。

2. 加入新工作:

加入新的「執行SQL工作」,並將工作更名為「準備SQL工作2」

編輯「準備SQL工作2」,如下設定:此處我們在 SQLStatement 輸入 TRUNCATE TABLE SalesOrderHeader2

3. 變更優先順序條件約束(precedence constraint)

你可以依照下列步驟,設定「優先順序條件約束」

為什麼說這個綠色的線條叫做「優先順序條件約束」,你可以點選它之後,選擇編輯,你可以由此設定該工作的執行條件。

設定「資料流程」

你可以雙擊「資料流程工作 1」,Designer 會自動切換到「資料流程」頁簽,你可以看到以下內容:

「資料流程工作 1」是一個「資料流程工作」,是真正執行資料移轉的工作。 SSIS 強就強在這裡,你可以在這個工作中進行資料的「合併」、「多點傳送」、「轉換」等作業,處理完後再將結果移轉到目的端。 底下我們先簡單的新增一個移轉作業。

設定「連線」

At the bottom of the SSDT IDE, locate the Connection Managers pane, which provides access to the connection managers used by your SSIS package.

SSIS supports a variety of data stores :

NOTE

When using stored procedures or parameterized queries against a SQL Server database in an Execute SQL Task, consider using the ADO.NET data provider rather than OLE DB data provider:

  • With ADO.NET, you can use parameter names in queries, instead of question marks as parameter placeholders.
  • When you are using stored procedures, ADO.NET allows you to set the query type appropriately , you just provide the name of the procedure and define the parameters in the Task Editor and the query statement is assembled automatically.
  • ADO.NET has better support for data types compared to OLE DB (for example, Xml, Binary, and Decimal data types)

Connection Manager Scope

SSDT support two connection manager scope :

  • Package-scoped connection managers
    This connection only available in the context of the SSIS package in which they were created and cannot be reused by other SSIS packages in the same SSIS project.
  • Project-scoped connection managers
    This connection is available to all packages of the project in which they were created.
NOTE

If a package connection manager and a project connection manager use the same name, the package connection manager overrides the project connection manager.

Parameterization

To simplify SSIS package deployment and maintenance, you should plan to parameterize all connection managers.

For more detail , ref MSDN:Integration Services (SSIS) Parameters

新增「連線」

底下示範建立幾種不同的「連線」:

建立一般檔案連接管理員(Flat File Connection)

你可以 package 底下的「連接管理員」的空白處,在右鍵選單中選擇「新增一般檔案連接」。

一般檔案(Flat File)指的就是固定格式的檔案,如 .csv 格式。

你可以在資料行頁簽,設定使用的分隔符號。

轉換成 project-scoped 連線

若你此時開啟同一個專案底下的不同 package ,你會發覺先前建立的連線並沒有在這個 package 的「連接管理員」之中。 這是因為這個連線是在 package 底下被建立的,所以它預設的使用範圍是 package-scoped。

如果是由專案下的「連接管理員」建立,則預設的使用範圍預設是 project-scoped 。

若你要把它轉換成 project-scoped ,你可以這麼做:在建立 connection 的那個 package 中,點選 connection 的右鍵選單,選擇「轉換成專案連接」:

只要是 project-scoped 的「連接」,其名稱前就會多了「專案」二字:

你也可以將 project-scoped 轉換回 package-scoped 。

Run the SSIS Package in Debug Mode

You can run the package in debug mode by :

  • On the Debug menu, select Start Debugging
  • press F5 on the keyboard

After the package run completed, you can get the result diagram of the execution.

沒有留言:

張貼留言