数据库类关系
正如“类关系”页面中所解释的那样,SmartEDB 提供了多种实现表之间关系(连接)的方法,每种方法在内存开销和性能方面都有其独特的优势和成本。以下各节将介绍 C# 开发人员可用的不同技术。
索引连接
在 SQL 中,两个表(Department 和 Employee)之间简单的“一对多”关系可以按如下方式实现:
select e.name, d.name from Employee e
inner join Department d d.dept_no = d.dept_no;
要实现像Department > Employee示例那样的一对多关系,我们可以这样定义两个数据库类:
[Persistent]
class Department
{
[Indexable(Type = Database.IndexType.BTree, Unique = true)]
public String code;
public String name;
[Indexable(Type = Database.IndexType.Hashtable, Unique = true)]
public int dept_no;
}
[Index("byDept_EmployeeName", Keys = new string[] { "dept_no", "name" }, Unique = true)]
[Persistent]
class Employee
{
[Indexable(Type = Database.IndexType.BTree, Unique = false)]
public String name;
public int dept_no;
}
在此,唯一哈希索引 dept_no 在 Department 类中充当“主键”,而唯一 B 树索引 byDept_EmployeeName 在 Employee 类中充当“外键”。请注意,byDept_EmployeeName 是由 Employee 类中的两个字段值组成的复合索引:dept_no 和 name。外键不一定是复合索引。仅在 dept_no 字段上创建一个简单的 B 树索引就足以形成连接,但使用复合索引,我们还具有额外的优势,即能够按 name 字段的字母顺序对具有相同 dept_no 值的所有实例进行排序。
为了展示连接是如何实现的,请考虑以下代码片段(取自 SDK 示例“csharp/Joins/Index_Join”),这些代码片段管理 Department 和 Employee 对象。首先,我们使用一些示例数据填充 Department 类:
// Create and insert Department objects
con.StartTransaction(Database.TransactionType.ReadWrite);
for (short i = 0; i < N_DEPARTMENTS; i++)
{
Department dept = new Department();
dept.code = DD[i].code;
dept.name = DD[i].name;
dept.dept_no = DD[i].dept_no;
con.Insert(dept); // Insert object to SmartEDB database
Console.WriteLine("\t" + i + ") " + dept.code + ", " + dept.name +
", dept_no=" + dept.dept_no);
}
con.CommitTransaction();
然后我们插入Employee对象,将它们与适当的Department对象关联起来:
con.StartTransaction(Database.TransactionType.ReadWrite);
// Find Department by code; extract dept_no; create Employee and assign name, dept_no
Cursor<Department> cursor = new Cursor<Department>(con, "code");
for (short i = 0; i < N_EMPLOYEES; i++)
{
Department dept = cursor.Find(ED[i].deptCode);
Employee emp = new Employee();
emp.dept_no = dept.dept_no;
emp.name = ED[i].name;
con.Insert(emp);
Console.WriteLine("\t" + i + ") " + emp.name + ", dept_no=" + emp.dept_no);
}
con.CommitTransaction();
cursor.Close();
请注意,我们在索引 Department.code 上实例化一个游标,并调用 Cursor.find() 方法定位到具有指定 deptCode 的 Department 对象;然后提取该 Department 的 dept_no 并存储到新的 Employee 对象中。这在后续查询中实现关联(关系)时,在 Department 对象和 Employee 对象之间创建了关联(关系)。现在,为了在(即“连接”)这两个类之间导航关系,我们可能会使用如下代码来显示给定员工所在部门的所有同事:
// Search for all Employee objects from a specified Employee's Department
String search_name = "William";
Console.Write("\n\n" + search_name + "'s co-workers in ");
con.StartTransaction(Database.TransactionType.ReadOnly);
// 1. Find the Employee object by name
Cursor<Employee> cursor1 = new Cursor<Employee>(con, "name");
Employee emp1 = cursor1.Find(search_name);
// 2. Find the Department object by its dept_no and display the Department name
Cursor<Department> cursor2 = new Cursor<Department>(con, "dept_no");
Department d = cursor2.Find(emp1.dept_no);
Console.Write(d.name + " are:\n");
// 3. Position the cursor in the byDept_EmployeeName compound index to the
// first object with this dept_no
con.StartTransaction(Database.TransactionType.ReadWrite);
Cursor<Employee> cursor3 = new Cursor<Employee>(con, "byDept_EmployeeName");
{
if (cursor3.Search(Operation.GreaterOrEquals, emp1.dept_no, ""))
{
foreach (Employee e in cursor3)
{
if(0 != e.dept_no.CompareTo(emp1.dept_no)) // Exit loop when Dept_no is
// no longer equal
{
break;
}
else if ( !String.Equals(e.name,search_name) ) // exclude serch_name
// from results
{
Console.WriteLine("\t" + e.name);
}
}
cursor2.MoveNext();
}
}
con.CommitTransaction();
请注意,我们使用索引 Employee.name 上的游标来查找具有指定名称的 Employee 对象;然后使用索引 Department.dept_no 上的游标通过找到的 Employee 的 dept_no 来查找 Department。接着,我们使用索引 Employee.byDept_EmployeeName 上的第三个游标定位到具有此 dept_no 的第一个对象,并在游标中滚动,直到 dept_no 不同为止。对于每个索引节点,我们从游标中获取 Employee 对象并检查其 name 字段。如果这不是我们正在查找同事的原始员工的名称,我们就显示该名称并调用 Cursor.MoveNext() 方法以转到索引树中的下一个节点。
优点和缺点
这是一种许多开发人员熟悉的典型“关系型”风格的连接。虽然它直观且易于实现,但管理多个索引会带来显著的开销。当提交插入事务时,为每个“部门”对象创建一个哈希索引和两个 B 树索引节点,为每个“员工”对象创建两个 B 树索引节点。这会导致内存消耗,而且除了数据库对象本身占用的空间外,B 树索引结构还需要“平衡”,这可能会消耗大量的处理器周期。因此,插入对象的整体性能不是最优的。 由于哈希索引 Idept_no 的存在,通过部门编号 dept_no 查找部门对象的操作效率很高。并且通过复合索引 Employee.byDept_EmployeeName
在员工对象上创建的游标有助于在索引树中“滚动”,从而按姓名字母顺序列出员工。
Autoid引用
上述索引连接示例的另一种技术是通过直接从“员工”对象引用其所属的“部门”来实现“外键”关系。一个自动 ID 字段可以实现此目的。例如,我们可能会将这两个数据库类定义如下:
[Persistent(AutoID = true)]
class Department
{
[Indexable(Type=Database.IndexType.BTree, Unique=true)] // Declare unique tree index by "code" field
public String code;
public String name;
}
[Persistent]
[Index("byDept_EmployeeName", Keys=new string[]{"dept","name"}, Unique=true)]
class Employee
{
[Indexable(Type = Database.IndexType.BTree, Unique = true)] // Declare unique tree index by "name" field
public String name;
[References(typeof(Department))]
public long dept;
}
请注意,Department 类不再具有 dept_no 字段,因此也没有基于该字段的索引。取而代之的是,它具有注解 (autoid=true)。这实际上会导致为 Department 对象的 autoid 值维护一个“隐藏”的哈希索引,随着新对象的创建,该值会自动递增。并且请注意,字段 dept 的类型为 long - 它将存储关联的 Department 对象的 autoid。
为了演示此连接的实现方式,请考虑以下代码片段(取自 SDK 示例“csharp/joins/Autoid_Ref”)。我们像上面的 Index_Join 示例那样填充 Department 类,但请注意,为了显示每个 Department 对象的 autoid,我们从 Connection.insert() 方法中获取它:
// Create and insert Department objects
con.StartTransaction(Database.TransactionType.ReadWrite);
Console.WriteLine("\nCreate Departments:\n");
for (short i = 0; i < N_DEPARTMENTS; i++)
{
Department dept = new Department();
dept.code = DD[i].code;
dept.name = DD[i].name;
long autoid = con.Insert(dept);
Console.WriteLine("\t" + i + ") " + dept.code + ", " + dept.name +", Autoid=" + autoid);
}
con.CommitTransaction();
当插入Employee对象时,我们使用Cursor.GetAutoId()方法从对应的Department对象中提取autoid,并将其分配给引用字段Employee.dept:
// Create and insert Employee objects
Console.WriteLine("\nCreate Employees and join each to a Department:\n");
con.StartTransaction(Database.TransactionType.ReadWrite);
Cursor<Department> cursor = new Cursor<Department>(con, "code");
for (short i = 0; i < N_EMPLOYEES; i++)
{
// Find the Department object for this Employee by the BTree index on
// Department.code
cursor.Find(ED[i].deptCode);
Employee emp = new Employee();
emp.name = ED[i].name;
// Assign the Department autoid for this Department object and insert new
// Employee object
emp.dept = cursor.GetAutoId(); // Note that the method GetAutoId is of the
// Cursor class
con.Insert(emp);
Console.WriteLine("\t" + i + ") " + emp.name + ", Department.Autoid=" + emp.dept);
}
con.CommitTransaction();
cursor.Close();
要显示给定员工所在部门的所有同事,其代码几乎与上面的 Index_Join 示例相同,唯一的区别在于我们在 Department 类上实例化游标时未指定索引字段。然后,我们通过调用 find 并传入 Employee.dept 的值来查找由其自动生成的主键标识的 Department 对象。这相当于在 Department.dept_no 索引上调用 Cursor.find(),只不过 dept_no 索引使用的是哈希索引,而自动生成的主键是自动产生的,而 dept_no 字段则必须指定一个唯一的值。同样的通过在复合索引 byDept_EmployeeName 上滚动游标的技术用于按字母顺序显示员工姓名:
// Search for all Employee objects from a specified Employee's Department
String search_name = "William";
Console.Write("\n\n" + search_name + "'s co-workers in ");
con.StartTransaction(Database.TransactionType.ReadOnly);
// 1. Find the Employee object by name
Cursor<Employee> cursor1 = new Cursor<Employee>(con, "name");
Employee emp1 = cursor1.Find(search_name);
// 2. Find the Department object by its autoid and display the Department name
Cursor<Department> cursor2 = new Cursor<Department>(con);
Department d = cursor2.Find(emp1.dept);
Console.Write(d.name + " are:\n");
// 3. Position the cursor in the byDept_EmployeeName compound index to the
// first object with this Department Autoid
Cursor<Employee> cursor3 = new Cursor<Employee>(con, "byDept_EmployeeName");
{
if (cursor3.Search(Operation.GreaterOrEquals, emp1.dept, ""))
{
foreach (Employee e in cursor3)
{
if (0 != e.dept.CompareTo(emp1.dept)) // Exit loop when Department
// autoid is no longer equal
{
break;
}
else if (!String.Equals(e.name, search_name)) // exclude serch_name from results
{
Console.WriteLine("\t" + e.name);
}
}
cursor3.MoveNext();
}
}
con.CommitTransaction();
优点和缺点
这种实现方式通过省去字段 dept_no 而节省了一些内存,而且由于为自增 ID 生成了哈希索引,查找 Department 对象的效率也很高。但若放宽应用程序的要求,或许还能获得一些优化。
请注意,无论是这种关联方式还是索引连接的关联方式,其性能和内存消耗的主要瓶颈都在于 B 树索引:Employee.byDept_EmployeeName、Employee.name 和 Department.name。例如,如果无需通过名称查找 Department 对象,则可以删除 Department.Name 索引。再者,如果无需按字母顺序显示给定部门的员工,则可以删除复合索引 Employee.byDept_EmployeeName。以下示例说明了这种方法。
autoids向量
我们可以在Department对象中存储Employee对象的自动id数组或向量,而不是在Employee对象中存储Employee对象的自动id。例如(参见SDK示例“csharp/joins/Autoid_Vector”),我们可以这样定义数据库类:
[Persistent(AutoID = true)]
class Department
{
[Indexable(Type=Database.IndexType.BTree, Unique=true)] // Declare unique tree index by "code" field
public String code;
public String name;
[References(typeof(Employee))]
public long[] employees;
}
[Persistent(AutoID = true)]
class Employee
{
[Indexable(Type = Database.IndexType.BTree, Unique = true)] // Declare unique tree index by "name" field
public String name;
[References(typeof(Department))]
public long dept;
}
请注意,这两个类都是用 [Persistent(autoid = true)] 定义的。员工的自动 ID 将存储在部门类的员工引用向量中。我们将像上面的 Autoid_Ref 示例那样,在员工引用字段 dept 中使用部门的自动 ID,以便从员工对象高效地访问部门对象。由于我们需要为部门对象中的动态自动 ID 值向量分配空间,并在插入员工对象时分配这些自动 ID,因此填充数据库的代码会变得稍微复杂一些。所以对于每个员工对象,我们执行以下代码来分配空间并将员工的自动 ID 插入到部门的员工向量中:
// Create and insert Department objects
con.StartTransaction(Database.TransactionType.ReadWrite);
Console.WriteLine("\nCreate Departments:\n");
for (short i = 0; i < N_DEPARTMENTS; i++)
{
Department dept = new Department();
dept.code = DD[i].code;
dept.name = DD[i].name;
// Allocate space for Employee autoids and initialize to 0
dept.employees = new long[VECTOR_SIZE];
for (short j = 0; j < VECTOR_SIZE; j++) dept.employees[j] = 0;
long autoid = con.Insert(dept);
Console.WriteLine("\t" + i + ") " + dept.code + ", " + dept.name +
", Autoid=" + autoid);
}
con.CommitTransaction();
// Create and insert Employee objects
Console.WriteLine("\nCreate Employees and join each to a Department:\n");
for (short i = 0; i < N_EMPLOYEES; i++)
{
// Find the Department object for this Employee by the BTree index on
// Department.code
con.StartTransaction(Database.TransactionType.ReadWrite);
Cursor<Department> cursor = new Cursor<Department>(con, "code");
Department d = cursor.Find(ED[i].deptCode);
if (null != d)
{
// Find the first vacant Employee vector element
for (short j = 0; j < VECTOR_SIZE; j++)
{
if (0 == d.employees[j])
{
// Create Employee object and store its autoid in vector
// Department.employees
Employee emp = new Employee();
emp.name = ED[i].name;
// Assign the Department autoid for this Department object and insert
// new Employee object
emp.dept = cursor.GetAutoId(); // Note that the method GetAutoId()
// is of the Cursor class
d.employees[j] = con.Insert(emp); // Note that Insert() returns the
// Employee autoid
con.CurrentCursor.Update(); // Assure that the inserted autoid
// is made perminent
Console.WriteLine("\t" + i + ") " + emp.name + ", Department.Autoid="
+ emp.dept);
break;
}
}
}
else
{
Console.WriteLine("\tDepartment.code (" + ED[i].deptCode +
") not found!");
}
con.CommitTransaction();
cursor.Close();
}
现在,要显示给定员工所在部门的所有同事的代码,只需像上面的 Autoid_Ref 示例那样通过在不指定索引字段的情况下实例化 Department 类的 Cursor 来查找 Department 对象。然后,我们调用 find 并传入 Employee.dept 的值,通过其自增 ID 查找 Department 对象。然后,我们只需从向量中列出此部门的员工,如下所示:
// Search for all Employee objects from a specified Employee's Department
String search_name = "William";
Console.Write("\n\n" + search_name + "'s co-workers in ");
con.StartTransaction(Database.TransactionType.ReadOnly);
// 1. Find the Employee object by name
Cursor<Employee> cursor1 = new Cursor<Employee>(con, "name");
Employee emp1 = cursor1.Find(search_name);
if ( null != emp1 )
{
// 2. Find the Department object by its autoid and display its name
Cursor<Department> cursor2 = new Cursor<Department>(con);
Department dept1 = cursor2.Find(emp1.dept);
Console.Write(dept1.name + " are:\n");
// 3. Scroll through the vector of Employee autoids, find the Employee
// object and display its name
Cursor<Employee> cursor3 = new Cursor<Employee>(con);
for (short j = 0; j < VECTOR_SIZE && 0 != dept1.employees[j]; j++)
{
// Skip if this is the autoid of the "search_name" object
if (cursor1.GetAutoId() != dept1.employees[j])
{
Employee emp = cursor3.Find(dept1.employees[j]);
Console.WriteLine("\t" + emp.name);
}
}
}
优点和缺点
此实现方式显著减少了内存使用,并通过消除两个 B 树索引大幅减少了后台树结构维护工作,只是以不按字母顺序排列员工列表为代价。不过,要在应用程序中引入排序算法来解决这个问题其实很简单;但需要注意的是,如果每个部门的员工数量非常大,这种方法可能会变得不可行。